is bonded to the PEG region and comprises a group
which can be bonded to a peptide by transpeptidation,
to (b) a 3'-terminal end of a coding molecule which is a
nucleic acid comprising a 5' untranslated region comprising
a transcription promoter and a translation enhancer;
an ORF region which is bonded to the 3'-terminal side
of the 5' untranslated region and encodes a protein; and
a 3'-terminal region that is bonded to the 3'-terminal side

Description

Technical Field 

[0001] With the advancement of genetic engineering, it has become possible to readily construct biopolymers such
as nucleotides and peptides of which sequences are given, and in protein engineering, attempts are being made to
elucidate molecular structural and functional correlations as an approach for understanding biopolymer functions and
intermolecular interactions. However, it is difficult to theoretically approach three-dimensional structures of them due
to the diversity and complexity of the structures, and the attempt remains at the stage of modifying several residues
in an active site and observing resultant changes in the structure and functions. In the evolutionary molecular
engineering (Fushimi, J. (1991) Kagaku, 61, 333-340; Fushimi, J. (1992) Koza Shinka, vol. 6, University of Tokyo Press),
which has been recently highlighted, this difficult approach is not required, but selection pressure is applied based on
a function to evolve proteins etc. in vitro or to detect interactions between proteins or between a nucleic acid and a
protein and analyze their networks (Miyamoto, E. & Yanagawa, H. (2000) Series: Post-Sequence Genome Science 3:
Proteomics, pp. 136-166; Miyamoto, E. & Yanagawa, H. (2001) Tanpakushitsu Kakusan Koso, 46(1): 1-10).

[0002] The evolutionary molecular engineering of proteins based on evolutionary molecular engineering of RNA,
which appeared in the 1990s, primarily aims at searching a vast sequence space, which could in no way be contemplated
in the conventional protein engineering, to select optimum sequences therefrom. As is symbolized by the fact that we
still find "screening" useful even now in selection of useful proteins, molecular designing based on the structural theory
is imperfect at present, and thus evolutionary techniques have a practical value as more efficient techniques.

[0003] The evolutionary molecular engineering also enables detection of intermolecular interactions and analysis of
the networks thereof, centered on functions, and in particular, it is expected to be applied to functional analyses of
genomes, which have become increasingly important in recent years (Miyamoto, E. & Yanagawa, H. (2000) Series:
Post-Sequence Genomic Science 3: Proteomics, pp. 136-156; Miyamoto, E. & Yanagawa, H. (2001) Tanpakushitsu
Kakusan Koso, 46(1), pp. 1-10). Thus, the evolutionary molecular engineering and post-genome functional analyses
of proteins can of course be applied to biotechnology such as utilization of biochips, biosensors and enzymes as
industrial catalysts by modification of functional biopolymers and creation of biopolymers having functions which cannot
be found in living organisms, and it can also be utilized in many industrial areas such as medicine, food, energy and
environment including preparation of drugs by discovery of important bioenzymes etc. based on analysis of networks
of interactions between proteins.

Background Art 

[0004] The evolutionary molecular engineering is a science aiming at constructing a system that progressively
evolves by repetition of three unit operations, "mutation", "selection" and "amplification", utilizing the Darwinian evolution
mechanism, and applying the system in engineering. The evolutionary molecular engineering was theoretically
proposed by Eigen et al. in 1984, and it is a new biotechnology in which molecular design of functional biopolymers is
performed by in vitro high-speed molecular evolution, that is, by investigating mechanisms of adaptive locomotion of
biopolymers in a sequence space and optimizing them in laboratory experiments (Fushimi, J. (1991) Kagaku, 61, 333-340;
Fushimi, J. (1992) Koza Shinka, vol. 6, University of Tokyo Press).
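
The repetition of the three unit operations named above can be illustrated with a toy simulation. The following is only a minimal sketch and not a procedure taken from this specification: the four-letter alphabet, the `fitness` function (number of positions matching an arbitrary target string), the parameter names and the choice to let unmutated parents compete in the selection step are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGU"  # assumed RNA alphabet for the toy sequences


def mutate(seq, rate, rng):
    """Mutation: replace each base by a randomly drawn base with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base for base in seq)


def fitness(seq, target):
    """Hypothetical selection criterion: count of positions that match `target`."""
    return sum(a == b for a, b in zip(seq, target))


def evolve(pool, target, rounds, keep, rate, seed=0):
    """Repeat mutation -> selection -> amplification for `rounds` cycles."""
    rng = random.Random(seed)
    size = len(pool)
    for _ in range(rounds):
        # Mutation: one variant is generated from every member of the pool.
        variants = [mutate(s, rate, rng) for s in pool]
        # Selection: keep the `keep` best sequences (parents compete as well).
        ranked = sorted(variants + pool, key=lambda s: fitness(s, target), reverse=True)
        survivors = ranked[:keep]
        # Amplification: copy the survivors until the pool has its original size.
        pool = [survivors[i % keep] for i in range(size)]
    return pool
```

Because the parents take part in the selection step, the best score in the pool cannot decrease from one cycle to the next in this sketch; an experimental system gives no such guarantee.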

[0005] As one of important elemental techniques in the evolutionary molecular engineering, "assigning of genotype
to phenotype" can be mentioned. The following three types of the "assigning of genotype to phenotype" are frequently
adopted in the natural world or evolutionary molecular engineering (Fushimi, J. (1999) Kagaku to Seibutsu, 37, 678-684):

(a) the ribozyme type in which a portion corresponding to a genotype and a portion corresponding to a phenotype
are carried on the same molecule;

(b) the virus type in which a portion corresponding to a genotype and a portion corresponding to a phenotype form
a complex; and

(c) the cell type in which a portion corresponding to a genotype and a portion corresponding to a phenotype are
contained in a single compartment.

The evolutionary molecular engineering of RNA is of (a) the ribozyme type, and (b) the virus type or (c) the cell type
is contemplated for the evolutionary molecular engineering of proteins. In the 1990s, RNA evolutionary molecular
engineering was developed by Joyce and Szostak et al. (Joyce, G.F. (1989) Gene, 82, 83; Szostak, J.W. & Ellington,
A.D. (1990) Nature, 346, 818), and in vitro experimental RNA systems (in vitro selection systems) utilizing (a) the
ribozyme-type assigning technique were proposed. Subsequently, the in vitro virus method (Nemoto, N., Miyamoto-
Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405; Yanagawa, H., Nemoto, N., Miyamoto, E. (1998) WO98/16636),
RNA-peptide fusion method (Roberts, R.W., Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297), STABLE
method (Doi, N. & Yanagawa, H. (1999) FEBS Lett., 457, 227) and so forth have been reported as in vitro experimental
systems for proteins (in vitro selection systems) utilizing the [...]
in the protein evolutionary molecular engineering. In addition, various techniques of the virus-type evolutionary molecular
engineering have been proposed so far, including the phage display (Smith, G.P. (1985) Science, 228, 1315-1317;
Scott, J.K. & Smith, G.P. (1990) Science, 249, 386-390), polysome display (Mattheakis, L.C. et al. (1994) Proc. Natl.
Acad. Sci. USA, 91, 9022-9026), library with encoding tags (Brenner, S. & Lerner, R.A. (1992) Proc. Natl. Acad. Sci.
[...] (1982) Rev. Sci. Instrum., 53, 517-522) and so forth.
[0006] The sequence space size, that is, the size of a library, searchable in the evolutionary molecular engineering and
[...] virus-type [...] molecules [...] the size of a library [...]
[...] the case of phage display, since the virus parasitizes a cell. One
[...] virus-type assigning molecules can be constructed in vitro in the aforementioned in vitro virus
method (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405; Yanagawa, H., Nemoto, N. &
Miyamoto, E. (1998) WO98/16636), RNA-peptide fusion method (Roberts, R.W., Szostak, J.W. (1997) Proc. Natl. Acad.
Sci. USA, 94, 12297) [...] theoretically expected as global searching methods for a sequence
space [...] the ribozyme-type technique. Further, in the evolutionary molecular engineering, not
only the size of searchable sequence space, but also its diversity is important. The polysome display method
(Mattheakis, L.C. & Dower, W.J. (1995) WO95/11922) is known, and this technique is suitable for a peptide with a short
chain length since nucleic acid and a protein are bonded by a noncovalent bond via ribosome in this technique.
However, [...] long like a protein, handling thereof becomes problematic, that is, diversity of
[...] genotype [...] theoretically considered
[...] RNA-peptide fusion method and so forth. However, in order to actually construct a large-scale library
and handle genotypes with a long chain length, several problems must be solved.

[0007] As described above, in principle, a large-scale library can be constructed by using virus-type assigning
molecules in vitro as in the in vitro virus method, RNA-peptide fusion method and so forth. In practice, however, [...]
[...] on the efficiency of construction of the virus-type assigning molecules. The virus-type assigning
molecules are constructed by bonding a spacer containing puromycin to a nucleic acid sequence containing protein
[...] genotype [...] phenotype molecule (protein)
[...] translation system. In this case, since a genotype molecule to which a spacer is not bonded,
that is, a genotype molecule without puromycin, cannot be ligated to a phenotype molecule, the spacer binding efficiency
[...] RNA-peptide fusion method, a splint and DNA ligase are used to
[...] However, 1 random residue that does not exist in a template may often be added to the 3'-terminal
end of the genotype molecule at the time of transcription. Thus, the sequence of the molecule does not match the
splint sequence and hence ligation efficiency becomes poor. Accordingly, the splint is modified, but much labor and
cost are required. Further, in the in vitro virus method, RNA ligase is used to ligate a DNA spacer. Since RNA ligase
does not require a splint, it has no such problem as that of the DNA ligase. However, it is known that RNA ligase
[...] enzymatic activity compared with DNA ligase, and its ligation efficiency is also poor.
[0008] [...] the in vitro virus method and the RNA-peptide fusion method, the usable cell-free translation
system is limited to a rabbit reticulocyte cell-free translation system. Further, the virus-type assigning molecule
construction [...] remained low [...] lower
(Roberts, R.W. & Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297) or 10% or lower (Nemoto, N., Miyamoto-
Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405) of mRNA templates (genotype molecule) added to the cell-free
translation system. Although the efficiency is increased to 20 to 40% by treatment after translation in the RNA-peptide
fusion method (Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W. (2000) Methods in Enzymology, 318, 268-293), this
requires much labor and time such as ordinary translation followed by addition of magnesium ions (Mg2+) and potassium
ions (K+) and incubation at -20°C for 2 days. In the rabbit reticulocyte cell-free translation system, in addition to the
low assigning efficiency, mRNA stability is low, and therefore mRNA with a long chain cannot be handled. In contrast,
in a wheat germ cell-free translation system, mRNA stability is favorable, and hence mRNA with a long chain can be
handled. Therefore, it is desirable that a virus-type assigning molecule can be constructed on ribosome in the wheat
germ cell-free translation system, but this has not been realized so far.

[0009] Among the factors determining the construction efficiency of virus-type assigning molecules, the most
important one is the difference in translation efficiency of the genotype molecule. This can be expected to largely depend
on sequences of a transcription promoter and a translation enhancer in the 5' untranslated region (5' UTR), the 3'-end side
sequence and so forth of the genotype molecule. However, there has been no report on examination of the relationship
between the translation efficiency and the virus-type assigning molecule construction efficiency.

Disclosure of the Invention

[0010] An object of the present invention is to improve the scale and diversity of a library, which are important in the
evolutionary molecular engineering and post-genome functional analysis, and to improve efficiency of construction of
assigning molecules in order to achieve the above object. Further, another object of the present invention is to achieve
higher efficiency and simplification of each step of ligation and assigning translation. Further, another object of the
present invention is to construct a virus-type assigning molecule in a wheat germ cell-free translation system to enable
mass synthesis and handling of long genotype molecules, which are advantages of use of wheat germ, and thereby
establish a foundation for constructing a large library with high diversity in both of the evolutionary molecular engineering
and genome functional analysis.

[0011] The inventors of the present invention earnestly conducted investigations to achieve the aforementioned
objects. As a result, they found that the construction of the assigning molecule, which had been limited to the rabbit
reticulocyte cell-free translation system so far, could be realized in the wheat germ cell-free translation system, the
construction efficiency of the assigning molecules was improved, and a large-scale library with high diversity could be
realized by improving a spacer containing puromycin (referred to as "spacer portion" hereinafter) and a nucleotide
sequence containing protein information (referred to as "coding portion" hereinafter) in such an assigning molecule as
shown in Fig. 1. The expression "an assigning molecule as shown in Fig. 1" used herein means an assigning molecule
constructed by bonding a spacer portion containing puromycin to a coding portion by a certain method to obtain a
genotype molecule and ligating it to a phenotype molecule (referred to as "decoded portion" hereinafter) on a ribosome
in a cell-free translation system.

[0012] Accordingly, the present invention provides the following.

(1) A spacer molecule comprising a donor region which can be bonded to a 3'-terminal end of nucleic acid, a PEG
region that is bonded to the donor region and comprises polyethylene glycol as a main component, and a peptide
acceptor region which is bonded to the PEG region and comprises a group which can be bonded to a peptide by
transpeptidation.

(2) The spacer molecule according to (1), wherein the peptide acceptor region comprises puromycin or a derivative
thereof, or puromycin or a derivative thereof and one or two residues of deoxyribonucleotides or ribonucleotides.

(3) The spacer molecule according to (1) or (2), which comprises at least one function-imparting unit between the
donor region and the PEG region.

(4) The spacer molecule according to (3), wherein the function-imparting unit is at least one residue of functionally 
modified deoxyribonucleotide or ribonucleotide. 

(5) A coding molecule, which is a nucleic acid comprising a 5' untranslated region comprising a transcription
promoter and a translation enhancer; an ORF region which is bonded to the 3'-terminal side of the 5' untranslated
region and encodes a protein; and a 3'-terminal region which is bonded to the 3'-terminal side of the ORF region
and comprises a polyA sequence and a sequence which a restriction enzyme XhoI recognizes on the 5'-terminal
side of the polyA sequence.

(6) The coding molecule according to (5), wherein the transcription promoter is a promoter of SP6 RNA polymerase. 

(7) The coding molecule according to (5) or (6), wherein the translation enhancer is a part of the TMV omega
sequence of tobacco mosaic virus (029).

(8) The coding molecule according to any one of (5) to (7), which comprises an affinity tag sequence in a portion 
downstream from the ORF region. 

(9) The coding molecule according to (8), wherein the affinity tag sequence is a Flag-tag sequence, which is a tag 
for affinity separation and analysis based on an antigen-antibody reaction. 

(10) A genotype molecule constructed by bonding a 3'-terminal end of a coding molecule which is a nucleic acid
comprising a 5' untranslated region comprising a transcription promoter and a translation enhancer; an ORF region
which is bonded to the 3'-terminal side of the 5' untranslated region and encodes a protein; and a 3'-terminal region
which is bonded to the 3'-terminal side of the ORF region and comprises a polyA sequence, to the donor region
of the spacer molecule as defined in any one of (1) to (4).

(11) The genotype molecule according to (10), wherein the transcription promoter is a promoter of SP6 RNA
polymerase.

(12) The genotype molecule according to (10) or (11), wherein the translation enhancer is a part of the TMV omega 
sequence of tobacco mosaic virus (029). 

(13) The genotype molecule according to any one of (10) to (12), wherein the 3'-terminal end sequence comprises
a sequence which a restriction enzyme XhoI recognizes on the 5'-terminal end side of the polyA sequence.

(14) The genotype molecule according to any one of (10) to (13), which comprises an affinity tag sequence in a 
portion downstream from the ORF region. 

(15) The genotype molecule according to (14), wherein the affinity tag sequence is a Flag-tag sequence, which is
a tag for affinity separation and analysis based on an antigen-antibody reaction.

(16) A method for constructing a genotype molecule, which comprises bonding (a) a 3'-terminal end of a coding
molecule which is RNA comprising a 5' untranslated region comprising a transcription promoter and a translation
enhancer; an ORF region which is bonded to the 3'-terminal side of the 5' untranslated region and encodes a
protein; and a 3'-terminal region which is bonded to the 3'-terminal side of the ORF region and comprises a polyA
sequence, to (b) the donor region of the spacer molecule as defined in any one of (1) to (4), which comprises RNA,
by using RNA ligase in the presence of free polyethylene glycol having the same molecular weight as that of
polyethylene glycol constituting the PEG region in the spacer molecule.

(17) An assigning molecule constructed by ligating the genotype molecule as defined in any one of (10) to (15) to
a phenotype molecule which is a protein encoded by the ORF region in the genotype molecule, by transpeptidation. 

(18) A method for constructing an assigning molecule, which comprises translating the genotype molecule as 
defined in any one of (10) to (15) in a cell-free translation system to ligate the genotype molecule to a phenotype 
molecule which is a protein encoded by the ORF region in the genotype molecule, by transpeptidation. 

(19) The method according to (18), wherein the cell-free translation system is a wheat germ cell-free translation
system.

(20) The method according to (18), wherein the cell-free translation system is a rabbit reticulocyte cell-free trans- 
lation system. 

(21) A method for screening a nucleotide sequence encoding a protein which acts on a target substance, which
comprises measuring an interaction between a decoded portion of an assigning molecule and the target substance
by using a library comprising a plurality of the assigning molecules as defined in (17), among which at least a part
of the assigning molecules have different sequences of the ORF regions in their coding portions, and detecting
the nucleotide sequence of the coding portion of the assigning molecule exhibiting the interaction.



Brief Description of the Drawings 
[0013] 



Fig. 1 is a schematic view showing structures of the assigning molecule, the spacer molecule and the coding 
molecule of the present invention. 

Fig. 2 shows a detailed constitution of an exemplary spacer molecule of the present invention. D: a donor region,
X2 and X1: function-imparting units, PEG: a PEG region, A: a peptide acceptor region, Bio: biotin, and Fl: a
fluorescent dye.

Fig. 3 shows a detailed constitution of an exemplary coding molecule of the present invention.

Fig. 4 shows results (electrophoresis photograph) of construction of the assigning molecules of the present
invention in a wheat germ cell-free translation system.

Fig. 5 shows stability of the genotype molecules of the present invention (ligation product of the spacer molecule 
and the coding molecule). A: comparison of stability of genotype molecules with different spacers, and B: compar- 
ison of stability of the genotype molecules of the present invention in the different translation systems. 
Fig. 6 shows changes in assigning efficiency by optimization of the acceptor region and the PEG region in the 
spacer molecule of the present invention. A: the case where the acceptor portion is dC-puromycin, and B: the case 
where the acceptor portion is dCdC-puromycin. 

Fig. 7 shows changes in translation efficiency by optimization of the coding molecule of the present invention. A: 
optimization of 5' UTR (the result of electrophoresis (photograph) is also shown), and B: optimization of the 3'- 
terminal region (3' tail). 

Fig. 8 shows the relationship between translation efficiency and assigning efficiency of the coding molecule of the
present invention.

Fig. 9 shows changes in ligation efficiency resulting from optimization of the coding molecule of the present
invention. A: optimization of the 3'-terminal region (3' tail) (the result of electrophoresis (photograph) is also shown),
and B: optimization of 5' UTR.

Fig. 10 shows changes in ligation efficiency depending on the molecular weight of the PEG region in the spacer 
molecule of the present invention. 

Fig. 11 shows a synthesis scheme of the spacer molecule of the present invention. A shows a structure of a 
compound used for synthesis, and B shows synthesis steps. 

Best Mode for Carrying out the Invention

[0014] In the present specification, the term "assigning molecule" means a molecule that assigns a genotype to a
phenotype. The assigning molecule is constructed by bonding a genotype molecule comprising nucleic acid having a
nucleotide sequence that reflects a genotype, to a phenotype molecule comprising a protein involved in expression of
a phenotype. The genotype molecule is constructed by bonding a coding molecule that has the nucleotide sequence
reflecting a genotype in such a form that the nucleotide sequence can be translated, to a spacer portion.
[0015] A portion derived from the phenotype molecule, a portion derived from the spacer molecule and a portion
derived from the coding molecule in the assigning molecule are referred to as a decoded portion, a spacer portion and
a coding portion, respectively. Further, a portion derived from the spacer molecule and a portion derived from the coding
molecule in the genotype molecule are referred to as a spacer portion and a coding portion, respectively.
[0016] Fig. 1 shows schematic constitutions of an exemplary assigning molecule, spacer molecule and coding molecule
of the present invention. This assigning molecule comprises a spacer containing puromycin (referred to as "spacer
portion") and a nucleotide sequence reflecting codes of a phenotype (referred to as "coding portion"). This assigning
molecule is constructed by bonding the spacer portion containing puromycin to the coding molecule by a certain method
to obtain a genotype molecule, and ligating the genotype molecule to a phenotype molecule on a ribosome in a
cell-free translation system. The spacer molecule comprises a PEG region containing polyethylene glycol as a main
component; a CCA region containing at least puromycin, or puromycin and DNA and/or RNA of 1 or more residues; a donor
region containing DNA and/or RNA of at least 1 residue; and further a function-imparting unit (X) comprising functionally
modified DNA and/or RNA of at least 1 residue. The coding molecule comprises a 3'-terminal region that contains a
polyA sequence of DNA and/or RNA comprising a sequence of a part of a decoded portion; 5' UTR that comprises
DNA and/or RNA and contains a transcription promoter and a translation enhancer; and further an ORF region mainly
comprising the sequence of the phenotype molecule.
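
The arrangement of regions described for Fig. 1 can be written down as a small data model. The following is only an illustrative sketch: the class names, field names and example values are hypothetical and are not notation used in this specification; the order of regions follows the statements that the donor region is bonded to the 3'-terminal end of the coding molecule and that the function-imparting unit lies between the donor region and the PEG region.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpacerMolecule:
    donor: str                              # residues ligated to the coding molecule
    peg_mw: int                             # average molecular weight of the PEG region
    acceptor: str                           # peptide acceptor (CCA) region
    function_units: Tuple[str, ...] = ()    # optional units between donor and PEG


@dataclass(frozen=True)
class CodingMolecule:
    utr5: str    # 5' UTR: transcription promoter and translation enhancer
    orf: str     # ORF region encoding the protein
    tail3: str   # 3'-terminal region containing the polyA sequence


@dataclass(frozen=True)
class GenotypeMolecule:
    coding: CodingMolecule
    spacer: SpacerMolecule

    def regions(self) -> List[str]:
        """Region labels in order, starting from the 5' end of the coding portion."""
        return [
            "5' UTR", "ORF", "3'-terminal region",
            "donor", *self.spacer.function_units, "PEG", "acceptor",
        ]


# Hypothetical example values, chosen only to exercise the model.
example = GenotypeMolecule(
    coding=CodingMolecule(utr5="SP6 + omega", orf="ORF", tail3="XhoI + polyA"),
    spacer=SpacerMolecule(donor="dC", peg_mw=2000, acceptor="dC-puromycin",
                          function_units=("X2", "X1")),
)
print(example.regions())
```

An assigning molecule would add the decoded portion (the translated protein) bonded to the acceptor; it is omitted here to keep the sketch short.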

<1> Spacer molecule of the present invention

[0017] The spacer molecule of the present invention comprises a donor region that can be bonded to a 3'-terminal
end of nucleic acid, a PEG region that is bonded to the donor region and comprises polyethylene glycol as a main
component, and a peptide acceptor region that is bonded to the PEG region and comprises a group that can be bonded
to a peptide by transpeptidation.

[0018] The donor region that can be bonded to the 3'-terminal end of nucleic acid usually comprises 1 or more
nucleotides. The number of nucleotides is usually 1 to 15, preferably 1 to 2. The nucleotides may be ribonucleotides
or deoxyribonucleotides.

[0019] The sequence of the 5'-terminal end of the donor region determines ligation efficiency. In order to ligate the
coding portion and the spacer portion, at least 1 or more residues need to be contained. For the acceptor having the
polyA sequence, at least 1 residue of dC (deoxycytidylic acid) or 2 residues of dCdC (dideoxycytidylic acid) are
preferred. The type of the nucleotide is more preferred in the order of C > U or T > G > A.
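
The stated order of preference, C > U or T > G > A, can be encoded as a simple ranking. This is a minimal sketch; the function name and the numeric ranks are assumptions made for illustration, and U and T are given the same rank because the text does not distinguish between them.

```python
# Lower rank = more preferred; U and T share a rank as in "C > U or T > G > A".
PREFERENCE = {"C": 0, "U": 1, "T": 1, "G": 2, "A": 3}


def rank_bases(bases):
    """Sort candidate bases from most to least preferred (stable for ties)."""
    return sorted(bases, key=lambda b: PREFERENCE[b])
```

For example, `rank_bases("AGTC")` places C first and A last.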

[0020] The PEG region comprises polyethylene glycol as a main component. The expression "comprising as a main
component" used herein means that the total number of nucleotides contained in the PEG region is 20 bp or less, or
the average molecular weight of polyethylene glycol is 400 or more. Preferably, it means that the total number of
nucleotides is 10 bp or less, or the average molecular weight of polyethylene glycol is 1000 or more.
[0021] The average molecular weight of polyethylene glycol in the PEG region is usually 400 to 30,000, preferably
1,000 to 10,000, more preferably 2,000 to 8,000. When the molecular weight of polyethylene glycol is less than about
400, a post-translation treatment may be needed after the assigning translation of the genotype molecule
containing a spacer portion derived from such a spacer molecule (Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W.
(2000) Methods in Enzymology, 318, 268-293). However, when PEG having a molecular weight of 1000 or more, more
preferably 2000 or more is used, high efficiency assigning can be achieved only through the assigning translation, and
therefore the post-translation treatment becomes unnecessary. Further, as the molecular weight of polyethylene glycol
increases, the stability of the genotype molecule tends to increase and is favorable particularly with PEG having a
molecular weight of 1000 or more. On the other hand, the genotype molecule may become unstable with PEG having
a molecular weight of 400 or less, of which properties are not so different from those of a DNA spacer.
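
The molecular weight ranges given for the PEG region (usually 400 to 30,000, preferably 1,000 to 10,000, more preferably 2,000 to 8,000) can be summarized as a lookup. This is only a sketch for illustration; the function name and tier labels are assumptions, and the range boundaries are treated as inclusive, which the text does not state explicitly.

```python
def peg_tier(average_mw: float) -> str:
    """Classify an average PEG molecular weight against the stated ranges."""
    if 2000 <= average_mw <= 8000:
        return "more preferred"
    if 1000 <= average_mw <= 10000:
        return "preferred"
    if 400 <= average_mw <= 30000:
        return "usual"
    return "outside usual range"
```

Under these assumptions, a spacer with PEG of average molecular weight 4,000 falls in the narrowest range, while one with PEG 300 falls outside the usual range, where a post-translation treatment may be needed.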
[0022] The peptide acceptor region is not particularly limited so long as it can be bonded to the C-terminal of a
peptide. For example, puromycin, 3'-N-aminoacylpuromycin aminonucleoside (PANS-amino acid) such as PANS-Gly
of which amino acid portion is glycine, PANS-Val of which amino acid portion is valine and PANS-Ala of which amino
acid portion is alanine and PANS-total amino acids of which amino acid portion corresponds to the total amino acids
can be utilized. Further, 3'-N-aminoacyladenosine aminonucleoside (AANS-amino acid) bonded with an amide bond
formed as a result of condensation of an amino group of 3'-aminoadenosine and a carboxyl group of an amino acid as
a chemical bond such as AANS-Gly of which amino acid portion is glycine, AANS-Val of which amino acid portion is
valine and AANS-Ala of which amino acid portion is alanine and AANS-total amino acids of which amino acid portion
corresponds to the total amino acids can be utilized. Further, nucleoside, a bonding product of nucleoside and amino
acid via an ester bond or the like can be utilized. In addition, any substance having a bonding scheme that can chemically
bond a nucleoside or a substance having a chemical structure similar to that of nucleoside and an amino acid or a
substance having a chemical structure similar to that of amino acid can be utilized.

[0023] The peptide acceptor region preferably comprises puromycin or a derivative thereof, or puromycin or a
derivative thereof and 1 or 2 residues of deoxyribonucleotides or ribonucleotides. The term "derivative" used herein means
a derivative that can be bonded to the C-terminal of peptide in a protein translation system. The puromycin derivatives
are not limited to those having a complete puromycin structure, and include those lacking a part of the puromycin
structure. Specific examples of the puromycin derivatives include PANS-amino acid, AANS-amino acid and so forth.
[0024] Although it is sufficient that the peptide acceptor region is constituted by puromycin alone, it preferably has
a nucleotide sequence comprising 1 or more residues of DNA and/or RNA on the 5'-terminal side. Examples of the
sequences include dC-puromycin and rC-puromycin, and dCdC-puromycin, rCrC-puromycin, rCdC-puromycin,
dCrC-puromycin and so forth are more preferred. The CCA sequences that imitate the 3'-terminal end of aminoacyl-tRNA
(Philipps, G.R. (1969) Nature, 223, 374-377) are suitable. The type of nucleotide is more preferred in the order of C
> U or T > G > A.

[0025] The spacer molecule preferably contains at least one function-imparting unit between the donor region and
the PEG region. The function-imparting unit preferably comprises at least 1 residue of functionally modified
deoxyribonucleotide or ribonucleotide. For example, as a functionally modified substance, those having various separation
tags such as a fluorescent substance, biotin or His-tag shown in Fig. 2 introduced can be used.
[0026] Fig. 2 shows a detailed constitution of an exemplary spacer molecule. The spacer molecule comprises a PEG
region containing polyethylene glycol as a main component; a CCA region comprising puromycin or puromycin and
DNA and/or RNA of at least 1 residue; a donor region containing DNA and/or RNA of at least 1 or more residues; and
further a function-imparting unit (X) comprising DNA and/or RNA of which at least 1 residue of nucleotide is functionally
modified. In Fig. 2, a fluorescent substance T(Fl) and biotin T(Bio) are used as the function-imparting unit (X).

<2> Coding molecule of the present invention 

[0027] The coding molecule of the present invention is a nucleic acid comprising a 5' untranslated region comprising
a transcription promoter and a translation enhancer; an ORF region that is bonded to the 3'-terminal side of the 5'
untranslated region and encodes a protein; and a 3'-terminal region that is bonded to the 3'-terminal side of the ORF
region and contains a polyA sequence and a sequence which a restriction enzyme XhoI recognizes on the 5'-terminal
side of the polyA sequence.

[0028] The coding molecule may be DNA or RNA. In the case of RNA, the 5'-terminal end may have a Cap structure
or not. Further, the coding molecule may be incorporated in an arbitrary vector or plasmid.

[0029] The 3'-terminal region contains the XhoI sequence and a polyA sequence downstream therefrom. As a factor
affecting the ligation efficiency of the spacer molecule and the coding molecule, the polyA sequence in the 3'-terminal
region is important. The polyA sequence is a polyA continuous chain comprising at least 2 or more residues of a mixture
of dA and rA or either thereof, preferably a polyA continuous chain of 3 or more residues, more preferably 6 or more
residues, further preferably 8 or more residues.
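
The requirements on the 3'-terminal region (an XhoI sequence followed downstream by a polyA continuous chain of at least 2, preferably 3 or more, more preferably 6 or more, further preferably 8 or more residues) can be checked on a sequence string as sketched below. This is an illustration under stated assumptions: the function names and tier labels are hypothetical, the sequence is written as a plain string of A/C/G/T(U) letters without distinguishing dA from rA, and `CTCGAG` is the standard XhoI recognition sequence, which this text does not spell out.

```python
import re

XHOI_SITE = "CTCGAG"  # standard XhoI recognition sequence (assumed, not quoted from the text)


def polya_length(seq: str) -> int:
    """Length of the continuous run of A residues at the 3' end of `seq`."""
    match = re.search(r"A+$", seq.upper())
    return len(match.group()) if match else 0


def describe_tail(seq: str):
    """Return (XhoI site found 5' of the polyA run, polyA length, preference tier)."""
    s = seq.upper()
    n = polya_length(s)
    upstream = s[: len(s) - n]  # everything on the 5' side of the polyA run
    has_site = XHOI_SITE in upstream
    if n >= 8:
        tier = "further preferred"
    elif n >= 6:
        tier = "more preferred"
    elif n >= 3:
        tier = "preferred"
    elif n >= 2:
        tier = "minimum"
    else:
        tier = "too short"
    return has_site, n, tier
```

With a hypothetical tail such as `"ATGGCC" + "CTCGAG" + "A" * 8`, the XhoI site is found on the 5' side of an 8-residue polyA run, which corresponds to the most preferred length range.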

[0030] As a factor affecting the translation efficiency of the coding molecule, there can be mentioned the combination of
the 5' UTR comprising a transcription promoter and a translation enhancer and the 3'-terminal region containing a
polyA sequence. The effect of the polyA sequence in the 3'-terminal region is usually exhibited when it comprises 10
or fewer residues. The transcription promoter of the 5' UTR is not particularly limited, and T7/T3, SP6 or the like can be
used. SP6 is preferred, and it is particularly preferred when the omega sequence or a sequence containing a part of
the omega sequence is used as the translation enhancer sequence. The translation enhancer is preferably a part of
the omega sequence. As the part of the omega sequence, a part of the TMV omega
sequence (029; Gallie, D.R. & Walbot, V. (1992) Nucleic Acids Res., 20, 4631-4638) is preferred.

[0031] Further, for the translation efficiency, the combination of the XhoI sequence and the polyA sequence in the
3'-terminal region is important. The combination of the polyA sequence and an affinity tag sequence located
downstream from the ORF region, i.e., upstream of the XhoI sequence, is also important. The affinity tag sequence
is not particularly limited so long as it is a sequence that allows a protein to be detected by some means such as an
antigen-antibody reaction. A Flag-tag sequence, which is a tag for affinity separation/analysis based on an
antigen-antibody reaction, is preferred. As the effect of the polyA sequence, the translation efficiency of an affinity tag
such as a Flag tag to which the XhoI sequence and further a polyA sequence are bonded is improved.
[0032] The structure effective for the translation efficiency is also effective for the assigning efficiency. 
[0033] The ORF region may be any sequence comprising DNA and/or RNA. The sequence is not limited, and it may
be a gene sequence, exon sequence, intron sequence or random sequence, or it can be any naturally occurring
sequence or any artificial sequence. Further, when SP6 + 029 are used as the 5' UTR of the coding molecule, and Flag +
XhoI + An (n = 8) are used as the 3'-terminal region, for example, the length of the 5' UTR is about 60 bp and that of
the 3'-terminal region is about 40 bp. These lengths allow the regions to be incorporated into PCR primers as adaptor
regions. Therefore, the coding molecule of the present invention having the 5' UTR and the 3'-terminal region can be
readily constructed by PCR using any vector, plasmid or cDNA library. In the coding molecule of the present invention,
translation may proceed beyond the ORF region. That is, a termination codon does not need to exist at the
end of the ORF region.
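
The statement that a Flag + XhoI + An (n = 8) region is about 40 bp long can be checked with a short sketch. Everything below is a hypothetical illustration: the Flag coding sequence is one assumed codon choice for the peptide DYKDDDDK, the XhoI site is the canonical CTCGAG (the primers of the Examples use a modified site whose sequence is not reproduced here), and the function names are invented for this example.

```python
# Minimal codon table covering only the codons used in the assumed Flag sequence.
CODON_TABLE = {"GAC": "D", "GAT": "D", "TAC": "Y", "AAG": "K"}

FLAG_DNA = "GACTACAAGGACGACGATGACAAG"  # assumption: one codon choice for DYKDDDDK
XHOI_SITE = "CTCGAG"                   # canonical XhoI recognition sequence


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the minimal table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def three_prime_region(poly_a_len: int = 8) -> str:
    """Affinity tag, then XhoI site, then polyA, in 5' to 3' order."""
    return FLAG_DNA + XHOI_SITE + "A" * poly_a_len


def coding_molecule(five_prime_utr: str, orf: str, poly_a_len: int = 8) -> str:
    """Concatenate 5' UTR, ORF (no stop codon required) and 3'-terminal region."""
    return five_prime_utr + orf + three_prime_region(poly_a_len)


if __name__ == "__main__":
    region = three_prime_region()
    print(translate(FLAG_DNA), len(region))  # tag peptide and region length in nt
```

With these assumed sequences the 3'-terminal region is 38 nt, which is consistent with the "about 40 bp" figure and short enough to be carried on a PCR primer.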

[0034] Fig. 3 shows a detailed constitution of an exemplary coding molecule. The coding molecule comprises the
3'-terminal region; the 5' UTR, which contains a transcription promoter and a translation enhancer and comprises DNA
and/or RNA; and an ORF region comprising sequence information of a decoded portion, that is, encoding a phenotype protein.
In Fig. 3, the 3'-terminal region contains an affinity tag sequence comprising DNA or RNA, the XhoI sequence and a polyA
sequence, and a Flag-tag sequence is used as the affinity tag sequence. As the 5' UTR, a sequence containing SP6 as a
transcription promoter and 029, which is a part of the omega sequence, as a translation enhancer is used.

<3> Genotype molecule of the present invention and method for constructing the same 

[0035] The genotype molecule of the present invention is constructed by bonding the 3'-terminal end of a coding
molecule, which is a nucleic acid comprising a 5' untranslated region comprising a transcription promoter and a
translation enhancer; an ORF region that is bonded to the 3'-terminal side of the 5' untranslated region and encodes a
protein; and a 3'-terminal region that is bonded to the 3'-terminal side of the ORF region and comprises the polyA
sequence, to a donor region of the spacer molecule of the present invention.

[0036] The coding molecule constituting the genotype molecule of the present invention is the same as the coding
molecule of the present invention described above, except that the XhoI sequence is not essential. However, it is
preferable for it to have the XhoI sequence.
[0037] The genotype molecule of the present invention can be constructed by bonding the 3'-terminal end of the
aforementioned coding molecule and the donor region of the spacer molecule by an ordinary ligase reaction. As the
reaction conditions, conditions of 4 to 25°C and 4 to 48 hours can be mentioned. If polyethylene glycol having the same
molecular weight as that of the polyethylene glycol in the PEG region of the spacer molecule
is added to the reaction system, the reaction time can be reduced to 0.5 to 4 hours at 15°C.

[0038] The combination of the spacer molecule and the coding molecule has an important effect on ligation efficiency.
The 3'-terminal region of the coding portion, which corresponds to an acceptor, preferably contains a polyA sequence
of DNA or RNA of at least 2 residues, preferably 3 or more residues, more preferably 6 to 8 or more residues.
Further, as the translation enhancer of the 5' UTR, a partial sequence of the omega sequence (029 in Fig. 3) is
preferred. As the donor region of the spacer portion, dC (deoxycytidylic acid) of at least 1 residue or dCdC
(dideoxycytidylic acid) of 2 residues is preferred. This makes it possible to use RNA ligase, thereby avoiding the
problems of DNA ligase, and to maintain an efficiency of 60 to 80%.

[0039] When the genotype molecule is RNA, it is preferable to bond (a) the 3'-terminal end of a coding molecule
comprising a 5' untranslated region comprising a transcription promoter and a translation enhancer, an ORF region
that is bonded to the 3'-terminal side of the 5' untranslated region and encodes a protein, and a 3'-terminal region that
is bonded to the 3'-terminal side of the ORF region and comprises the polyA sequence, to (b) a donor region of the
spacer molecule as defined in any one of (1) to (4) comprising RNA, by using an RNA ligase in the presence of free
polyethylene glycol having the same molecular weight as that of the polyethylene glycol constituting the PEG region in the
spacer molecule.

[0040] At the time of the ligation reaction, by adding polyethylene glycol having the same molecular weight as that
of the PEG region in the spacer portion containing the PEG region, ligation efficiency can be improved to 80 to 90%
or more irrespective of the molecular weight of polyethylene glycol in the spacer portion, and the separation process
after the reaction can be omitted.

<4> Assigning molecule of the present invention and method for constructing the same

[0041] The assigning molecule of the present invention is constructed by ligating the aforementioned genotype
molecule of the present invention to a phenotype molecule, which is a protein encoded by the ORF region in the genotype
molecule, by transpeptidation.

[0042] The assigning molecule of the present invention is also constructed by translating the genotype molecule of
the present invention in a cell-free translation system to ligate it to a phenotype molecule, which is a protein encoded
by the ORF region in the genotype molecule, by transpeptidation.

[0043] The cell-free translation system is preferably of wheat germ or rabbit reticulocyte. The translation conditions
may be those usually adopted. For example, conditions of 25 to 37°C and 15 to 240 minutes can be mentioned.
[0044] As for the cell-free translation system, the construction of assigning molecules has so far been examined in systems
of Escherichia coli (E. coli), rabbit reticulocyte and wheat germ, and assigning molecules were confirmed
only in the rabbit reticulocyte system (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405;
Roberts, R.W., Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297). However, according to the present
invention, the assigning molecule can also be constructed in a wheat germ system as an assigning molecule having a spacer
portion containing the PEG region. Further, the rabbit reticulocyte system was not very practical due to the lack of stability
of the genotype molecule and was conventionally applied only to genotype molecules having a short chain length
(Roberts, R.W., Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297; Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997)
FEBS Lett., 414, 405). However, the assigning molecule of the present invention having a spacer portion containing
the PEG region is more stable in the wheat germ system, and therefore the wheat germ system is a practically useful
system in which molecules having a long chain length can be handled.

<5> Screening method of the present invention 

[0045] The screening method of the present invention is a method for screening a nucleotide sequence encoding a
protein that acts on a target substance, which comprises measuring interactions between a decoded portion of an
assigning molecule and the target substance by using a library comprising a plurality of the assigning molecules as
defined in (17), among which at least a part of the assigning molecules have different sequences of the ORF regions
in their coding portions, and detecting the nucleotide sequence of the coding portion of the assigning molecule exhibiting
the interaction.

[0046] The aforementioned library can be constructed according to a usual method for constructing a library
comprising assigning molecules except that the assigning molecules of the present invention are used as the assigning
molecules. For example, in evolutionary molecular engineering, there can be used libraries constructed by
employing the methods of Error-prone PCR (Leung, D.W., et al. (1989) J. Methods Cell Mol. Biol., 1, 11-15), Sexual PCR
(Stemmer, W.P.C. (1994) Proc. Natl. Acad. Sci. USA, 91, 10747-10751), DNA shuffling, mutation library (Yanagawa,
H. & Tsuji, Y., "Mutant DNA Library Construction Method"; Japanese Patent Application No. 2000-293692) and so forth.
In genome functional analysis, a cDNA library constructed by random priming, dT priming or the like can be used.

[0047] As the target substance, proteins (including peptides, antibodies etc.), nucleotides and so forth can be mentioned.
Interactions can be measured by a method suitable for the type of the target substance (for example, Rigaut, G. et al.
(1999) Nature Biotech., 17, 1030-1032).

[0048] The nucleotide sequences can be detected by a usual method. For example, amplification by PCR or the like
can be mentioned. For example, the nucleotide sequences can be amplified by RT-PCR or the like (Joyce, G.F. (1989)
Gene, 82, 83; Szostak, J.W. & Ellington, A.D. (1990) Nature, 346, 818).

<6> Effect of the present invention 

[0049] The spacer molecule of the present invention serves as a spacer portion in an assigning molecule and is an
improved spacer containing polyethylene glycol as a main component, compared with a conventional spacer containing
DNA as a main component. This makes it possible to construct the assigning molecule not only in a rabbit reticulocyte
translation system but also in a wheat germ cell-free translation system, markedly improves the stability of the genotype
molecule in both of the translation systems, and makes any post-translation treatment unnecessary.
[0050] It is epoch-making that the construction of the assigning molecule, conventionally limited to the rabbit
reticulocyte cell-free translation system (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405; Roberts,
R.W., Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297), can be realized in a wheat germ cell-free translation
system by using a spacer comprising a PEG region containing polyethylene glycol as a main component
instead of a conventional spacer containing DNA as a main component (Liu, R., Barrick, E., Szostak, J.W., Roberts,
R.W. (2000) Methods in Enzymology, 318, 268-293; Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997) FEBS Lett.,
414, 405; Roberts, R.W., Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297). Further, in comparison with
those using a conventional DNA spacer (S30 spacer; Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W. (2000) Methods
in Enzymology, 318, 268-293) or a spacer containing DNA as a main component and polyethylene glycol having a
molecular weight of 400 or less (F30 spacer; Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W. (2000) Methods in Enzymology,
318, 268-293), the genotype molecule containing the spacer portion of the present invention has significantly higher
stability in both the rabbit and wheat translation systems and hence does not require post-treatment after assigning
translation for improving assigning efficiency. Since the treatment after the assigning translation can be simplified, the
working time for the assigning translation can be shortened from 48 to 72 hours to 0.5 to 1 hour. Further, since the
stability of the genotype molecule is higher in the wheat germ cell-free translation system than in the rabbit reticulocyte
cell-free translation system, the spacer portion containing the PEG region as a main component can increase the
stability of the genotype molecule in two ways. Therefore, it becomes possible to construct a library including
coding portions having a long chain length and thereby increase the diversity of the library.

[0051] By introducing a fluorescent substance into a dT nucleotide as a function-imparting unit (X), the assigning
molecule, which is conventionally detected by using a radioisotope (RI) with much labor and time, can be readily
detected based on fluorescence. Further, the assigning molecule can be separated and purified from a cell-free protein
synthesis system by introducing biotin or any of various tags.

[0052] The reason why the stability of the genotype molecule containing a spacer portion derived from the spacer
molecule of the present invention is high is considered to be that, since the 3'-terminal side is protected by a spacer portion
having polyethylene glycol with no electric charge, mRNA having the spacer portion is prevented from being attacked
by nuclease from the 3'-terminal side. Further, the reason why the post-translation treatment becomes unnecessary
and the assigning molecule can be constructed not only in a rabbit reticulocyte cell-free translation system but also
in a wheat germ cell-free translation system is considered to be that, since a spacer containing PEG as a main component
causes no interaction with ribosomes, other proteins or nucleic acids included in the translation system, the degree of
freedom of the spacer is increased not only in the rabbit reticulocyte cell-free translation system but also in the wheat
germ cell-free translation system in comparison with a DNA spacer, and puromycin of the spacer quickly enters
the A site on the ribosome to accelerate the ligation reaction between puromycin and a protein.

[0053] The coding molecule of the present invention serves as the coding portion of the assigning molecule, and
the use of specific sequences for the 5'-terminal and 3'-terminal sides of the coding molecule improves translation
efficiency and further improves the assigning efficiency by 4 to 5 times.

[0054] The effect of improving the translation efficiency is exhibited with a constitution comprising a 5' UTR that comprises a
transcription promoter and a translation enhancer and comprises DNA and/or RNA; an ORF region comprising the main
sequence of a decoded portion; and further a 3'-terminal region that comprises the polyA sequence and comprises DNA
and/or RNA. It is known that a polyA sequence stabilizes mRNA, and its average length is said to be several hundred bp
in eukaryotes. However, a polyA sequence having a length of 10 bp or less is particularly effective in the present
invention. Further, the polyA sequence is generally a sequence contained in the 3' UTR and is not translated. However, in
the present invention, the polyA sequence may be translated to constitute a part of the decoded portion. Further, it is
generally known that, as for the enhancer sequence, translation efficiency is better with a full-length omega sequence
than with a short omega sequence (Gallie, D.R., Walbot, V. (1992) Nucleic Acids Res., 20, 4631-4638). However,
in the present invention, a short omega sequence (029 in Fig. 3) shows better translation efficiency to the contrary.
The reason for this novel effect is considered to be that particular effects, such as better stability of a genotype molecule
containing the 029 sequence compared with that of a genotype molecule containing a full-length omega sequence,
may result from the combination of the 5' UTR and the 3'-terminal region. In actual examinations, the genotype
molecule was more stable when the 5' UTR contained 029, which is a part of the omega sequence, rather than the
full-length omega sequence.

[0055] By combining the spacer molecule and the coding molecule of the present invention, high ligation efficiency
of the spacer portion can always be realized without depending on the coding portion.

[0056] As an effect that is not exhibited with the coding portion alone or the spacer portion alone, a marked increase
in the ligation efficiency of the spacer portion and the coding portion can be mentioned. The ligation efficiency of the coding
portion and the spacer portion is conventionally 40% or less (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997)
FEBS Lett., 414, 405; Roberts, R.W., Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297). However, the ligation
efficiency, which is affected by the 3'-terminal end sequence of the coding portion, is always made high by the third aspect of
the present invention. For the ligation reaction, a splint and DNA ligase have been conventionally used as means for
bonding a spacer and a genotype in the RNA-peptide fusion method (Roberts, R.W. & Szostak, J.W. (1997) Proc.
Natl. Acad. Sci. USA, 94, 12297). However, upon transcription, 1 random residue that does not exist in the template is
often added to the 3'-terminal end of the genotype, and hence the sequence of the genotype is not complementary
to the sequence of the splint. As a result, the spacer bonding efficiency was not favorable. Further, in the in vitro virus
method (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405), RNA ligase is used
for bonding the spacer and the genotype. The aforementioned problem does not occur with RNA ligase since the splint
is unnecessary. However, it is known that RNA ligase originally has lower enzymatic activity compared with DNA ligase,
and its spacer bonding efficiency was also unfavorable. For use of RNA ligase as a ligation enzyme, nucleotides that are
desirable and undesirable for the acceptor nucleotide sequence and the donor nucleotide sequence are known
(Uhlenbeck, O.C., Gumport, R.I. (1982) The Enzymes, vol. XV, 31-58; England, T.E., Uhlenbeck, O.C. (1978) Biochemistry,
17, 2069-2076; Romaniuk, E., McLaughlin, L.W., Neilson, T., Romaniuk, P.J. (1982) Eur. J. Biochem., 125, 639-643).
However, all of these experiments were conducted by using short chains, and no report has shown that high
ligation efficiency is always obtained with a long acceptor or donor as in the present invention.

[0057] As for the reason why the ligation efficiency is improved by the present invention, it is known that the longer
the donor side nucleic acid becomes, the lower the ligation efficiency becomes when RNA ligase or DNA ligase is used
(Uhlenbeck, O.C. & Gumport, R.I. (1982) The Enzymes, vol. XV, 31-58). Therefore, it is possible that a high ligation
efficiency similar to that of ligation with a nucleic acid (donor) having a short chain length is achieved by using polyethylene
glycol with no electric charge instead of DNA as a main component of the spacer, and by using dC (deoxycytidylic acid)
of 1 residue or dCdC (dideoxycytidylic acid) of 2 residues in the donor region. The efficiency
may still decline depending on the sequence of the 3'-terminal region of the coding portion, but this problem
was solved by providing the polyA sequence in the 3'-terminal region. This sequence also contributes to the
improvement of translation efficiency. A ligation efficiency of 60 to 80% or higher can be obtained by using a polyA sequence
irrespective of the sequences upstream therefrom. Further, when the molecular weight of polyethylene glycol of the PEG
region is increased, the ligation efficiency may decline. However, a ligation efficiency of 80 to 90%
or higher can be obtained, irrespective of the molecular weight of polyethylene glycol, by adding polyethylene glycol
having the same molecular weight as that of the PEG region in the spacer portion and, in particular, adjusting the
mixing ratio of the coding portion and the spacer portion in the ligation reaction. It is considered that the ligation reaction
between the 3'-terminal end of the coding portion and the 5'-terminal end of the spacer portion is promoted because the viscosity
of the reaction field is increased by the presence of polyethylene glycol. As described above, since sufficiently high
ligation efficiency is obtained, the separation treatment conducted after the reaction in the ligation process becomes
unnecessary, and the conventional working time of 48 to 72 hours (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997)
FEBS Lett., 414, 405) was shortened to 4 to 8 hours. Consequently, it becomes possible to efficiently and readily
construct a large-scale library. Further, the assigning efficiency is markedly improved by further using the constitution of
the 5' UTR in the coding portion.

[0058] Thus, contrary to the conventional theory (Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W. (2000) Methods
in Enzymology, 318, 268-293), the inventors of the present invention practically realized a large-scale library with higher
diversity not only in a rabbit reticulocyte cell-free translation system, but also in a wheat germ cell-free translation
system.

Examples 


[0059] Hereafter, examples of the assigning molecules of the present invention will be specifically described.
However, the following examples should be construed as an aid for obtaining concrete knowledge of the present invention,
and in no way limit the scope of the present invention.

Example 1

[0060] The spacer molecule and the coding molecule shown in Fig. 1 were ligated by a ligation reaction to construct
a genotype molecule comprising a spacer portion derived from the spacer molecule and a coding portion derived from
the coding molecule, and this genotype molecule and a phenotype molecule were ligated via puromycin on the ribosome
by assigning translation to prepare an assigning molecule comprising the coding portion and spacer portion derived
from the genotype molecule and a decoded portion derived from the phenotype molecule. Details will be described
below.

(1) Synthesis of spacer molecule containing PEG region 


[0061] The spacer molecule containing a PEG region was synthesized by using the method outlined in Fig. 11, B.
Compound 1 was synthesized by the method reported by Ikeda et al. (Ikeda, S. et al. (1998) Tetrahedron Lett., 39,
5976-5978). The structures of the nucleotide phosphoramidites used (phosphoramidites providing dC, T(Fl) and
T(Bio)), the PEG phosphoramidite and the chemical phosphorylating agent are shown in Fig. 11, A. The nucleotide
phosphoramidites and the chemical phosphorylating agent were purchased from Glen Research (Virginia, USA). Polyethylene
glycol (PEG) having average molecular weights of 1000, 2000 and 3000 was purchased from NOF Corporation (Tokyo,
Japan). PEG having an average molecular weight of 4000 was purchased from Fluka (Switzerland). PEG
phosphoramidites were synthesized by the method reported by Jaschke et al. (Jaschke, A. et al. (1993) Tetrahedron Lett., 34,
301-304). In Fig. 11, DMTr represents a 4,4'-dimethoxytrityl group, and Fmoc represents a 9-fluorenylmethoxycarbonyl
group.

[0062] The following treatments A to D were performed on Compound 1 (400 mg, containing 10 µmol of puromycin
residue) according to a predetermined sequence until a predetermined number of nucleotides and PEG were
introduced.

A. Add 1 mL of 3% trichloroacetic acid solution in methylene chloride, leave it at room temperature for 3 minutes
and then wash it 3 times with 5 mL of methylene chloride. Repeat the same procedure, and then wash the reaction
product 5 times with 5 mL of anhydrous acetonitrile.

B. Add 30 µmol of nucleotide phosphoramidite, PEG phosphoramidite or chemical phosphorylating agent, 100 µL
of 0.457 M tetrazole solution in anhydrous acetonitrile and 1 mL of anhydrous acetonitrile, and shake it at room
temperature for 15 minutes. Wash the reaction product 5 times with 5 mL of acetonitrile.

C. Add 1 mL of 50 mM iodine solution (tetrahydrofuran/pyridine/water = 75:20:5), leave it at room temperature
for 3 minutes and wash it 3 times with 5 mL of pyridine. Repeat the same procedure, and then wash the reaction
product 5 times with 5 mL of anhydrous pyridine.

D. Add 1 mL of 10% acetic anhydride solution in pyridine and a catalytic amount of 4,4-dimethylaminopyridine,
leave it at room temperature for 20 minutes and wash it 5 times with 5 mL of pyridine and 5 times with 5 mL
of methylene chloride.
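
The reagent amounts in treatments B and C can be related to the 10 µmol of support-bound puromycin by a simple equivalents calculation. The following sketch is an illustration under the quantities stated above; the helper name micromoles and the dictionary layout are hypothetical, and the iodine figure refers to a single 1 mL addition (treatment C applies it twice).

```python
def micromoles(volume_ul: float, molar: float) -> float:
    """Amount in µmol from a volume in µL and a concentration in mol/L."""
    return volume_ul * molar  # µL x mol/L = µmol


SUPPORT_UMOL = 10.0  # puromycin residue on Compound 1, from [0062]

# Amount of each reagent per addition, in µmol.
REAGENTS_UMOL = {
    "phosphoramidite or phosphorylating agent": 30.0,      # treatment B
    "tetrazole": micromoles(100.0, 0.457),                 # treatment B
    "iodine (one addition)": micromoles(1000.0, 0.050),    # treatment C
}

# Molar equivalents relative to the support-bound puromycin.
EQUIVALENTS = {name: amount / SUPPORT_UMOL for name, amount in REAGENTS_UMOL.items()}

if __name__ == "__main__":
    for name, eq in EQUIVALENTS.items():
        print(f"{name}: {eq:.2f} equivalents")
```

On these figures each coupling uses a threefold excess of the phosphoramidite over the support-bound residue, with tetrazole and iodine in larger excess.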

[0063] To Compound 1, into which a predetermined number of nucleotides and PEG had been introduced according to
a predetermined sequence by the above treatments, 1.5 mL of concentrated aqueous ammonia and 0.5 mL of ethanol
were added, and the mixture was shaken at room temperature for 14 hours. The solid phase carrier (CPG) was removed
by filtration, and the filtrate was lyophilized. The residue was purified by HPLC [column: YMC pack ODS-A SH-343-5
produced by YMC (Kyoto, Japan); eluent: a linear concentration gradient of 10 to 60% acetonitrile in 0.1 M aqueous
triethylammonium acetate (pH 7.0) over 30 minutes; flow rate: 10 mL/min] to obtain a spacer molecule containing a
PEG region.

[0064] The types and yields of the obtained spacer molecules containing a PEG region were as follows. Meanings
of the abbreviations are as follows. p: phosphate group; dC: deoxycytidine; PEG (number): PEG having an average
molecular weight represented by the number; Puro: puromycin; T(Fl): thymidine labeled with a fluorescent dye; T(Bio):
thymidine labeled with biotin.



p(dCp)2PEG(1000)p(dCp)2Puro, yield: 8.7%
p(dCp)2T(Fl)pPEG(1000)pdCpPuro, yield: 62%
p(dCp)2T(Fl)pPEG(1000)p(dCp)2Puro, yield: 14%
p(dCp)2PEG(2000)p(dCp)2Puro, yield: 7%
p(dCp)2T(Fl)pPEG(2000)pdCpPuro, yield: 30%
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro, yield: 27%
p(dCp)2T(Bio)pPEG(2000)pdCpPuro, yield: 9%
p(dCp)2T(Bio)pPEG(2000)p(dCp)2Puro, yield: 8%
p(dCp)2T(Bio)pT(Fl)pPEG(2000)pdCpPuro, yield: 2%
p(dCp)2T(Bio)pT(Fl)pPEG(2000)p(dCp)2Puro, yield: 8%
p(dCp)2PEG(3000)pdCpPuro, yield: 2%
p(dCp)2PEG(3000)p(dCp)2Puro, yield: 22%
p(dCp)2T(Fl)pPEG(3000)pdCpPuro, yield: 29%
p(dCp)2T(Fl)pPEG(3000)p(dCp)2Puro, yield: 23%
p(dCp)2T(Fl)pPEG(4000)pdCpPuro, yield: 16%
p(dCp)2T(Fl)pPEG(4000)p(dCp)2Puro, yield: 17%

(2) Preparation of coding molecule 



[0065] A coding molecule was amplified by PCR using a DNA template containing the sequence of the mouse-derived
c-jun (Gentz, R., Rauscher, F.J. 3rd, Abate, C., Curran, T. (1989) Science, 243, 1695-9; Neuberg, M., Schuermann, M.,
Hunter, J.B., Muller, R. (1989) Nature, 338, 589-90). At this time, 12 types of DNA templates in total were prepared
by using 6 types of primers (SP6-029, T7-029, SP6-AO, T7-AO, T7-0', T7-K) for the 5'-terminal end and 6 types of
primers (FlagXA, FlagX, FlagXA(G3), FlagXA(C1), FlagA, Flag) for the 3'-terminal side. That is, PCR was performed
4 times for each DNA template under the following conditions by using TaKaRa ExTaq (Takara Shuzo), and purification
was performed by using QIAquick PCR Purification Kit (QIAGEN).



Table 1 (PCR reaction solution)

10 x Ex Buffer              10 µl
2.5 mM dNTP                 8 µl
DEPC water                  76.7 µl
ExTaq                       0.3 µl
Template (1 nmol/µl)        1 µl
Primer 1 (20 pmol/µl)       2 µl
Primer 2 (20 pmol/µl)       2 µl
Total                       100 µl

Table 2 (Template, primers and reaction conditions)

Template   c-jun[F]        (SEQ ID NO: 1)
Primer 1   5' SP6-029      (SEQ ID NO: 2)    SP6 + partial omega sequence (029)
           5' T7-029       (SEQ ID NO: 3)    T7 + partial omega sequence (029)
           5' SP6-AO       (SEQ ID NO: 4)    SP6 + full-length omega sequence
           5' T7-AO        (SEQ ID NO: 5)    T7 + full-length omega sequence
           5' T7-0'        (SEQ ID NO: 6)    T7 + partial omega sequence (O') not overlapping 029
           5' T7-K         (SEQ ID NO: 7)    T7 + Kozak sequence
Primer 2   3' FlagXA       (SEQ ID NO: 8)    Flag + modified XhoI site + polyA
           3' FlagX        (SEQ ID NO: 9)    Flag + modified XhoI site
           3' FlagXA(G3)   (SEQ ID NO: 10)   Flag + modified XhoI site + polyA
           3' FlagXA(C1)   (SEQ ID NO: 11)   Flag + modified XhoI site + polyA
           3' FlagA        (SEQ ID NO: 12)   Flag + polyA
           3' Flag         (SEQ ID NO: 13)   Flag
Program    94°C            2 min
           35 cycles of the following:
           94°C            30 sec
           62°C            30 sec
           74°C            1 min
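
The figures in Tables 1 and 2 can be cross-checked with a few lines of arithmetic. The sketch below is an illustration only: the dictionary and function names are hypothetical, the dNTP stock is taken at its labeled 2.5 mM, and the program time counts only the programmed hold times, not the time the thermal cycler spends changing temperature.

```python
# Volumes from Table 1, in µl.
PCR_MIX_UL = {
    "10 x Ex Buffer": 10.0,
    "2.5 mM dNTP": 8.0,
    "DEPC water": 76.7,
    "ExTaq": 0.3,
    "Template": 1.0,
    "Primer 1": 2.0,
    "Primer 2": 2.0,
}

# Hold times of one cycle from Table 2, in seconds (94, 62 and 74 degrees C).
CYCLE_STEPS_S = (30, 30, 60)


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilute a stock concentration into the total reaction volume (same unit as stock)."""
    return stock * volume_ul / total_ul


def program_hold_time_s(initial_s: int, cycle_steps_s, cycles: int) -> int:
    """Sum of the initial denaturation and all cycle hold times, in seconds."""
    return initial_s + cycles * sum(cycle_steps_s)


total_ul = sum(PCR_MIX_UL.values())
dntp_mM = final_concentration(2.5, PCR_MIX_UL["2.5 mM dNTP"], total_ul)
# 20 pmol/µl is the same as 20 µM, so the result is in µM.
primer_uM = final_concentration(20.0, PCR_MIX_UL["Primer 1"], total_ul)
hold_min = program_hold_time_s(2 * 60, CYCLE_STEPS_S, 35) / 60

if __name__ == "__main__":
    print(f"total {total_ul:.1f} µl, dNTP {dntp_mM:.2f} mM, "
          f"each primer {primer_uM:.2f} µM, hold time {hold_min:.0f} min")
```

The listed volumes add up to the stated 100 µl total, each primer ends up at about 0.4 µM, and the programmed hold times amount to 72 minutes per run.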





[0066] 6 to 12 µg of the following coding molecules (DNA templates) were obtained by the above method. For
convenience, the coding molecules were designated as "(name of Primer 1)Jun-(name of Primer 2)" according to the
primers used for PCR. DNA or RNA is indicated in the brackets.

[DNA] SP6-029Jun-FlagXA, yield: 10 µg
[DNA] T7-029Jun-FlagXA, yield: 12 µg
[DNA] SP6-AOJun-FlagXA, yield: 10 µg
[DNA] T7-AOJun-FlagXA, yield: 7 µg
[DNA] T7-0'Jun-FlagXA, yield: 9 µg
[DNA] T7-KJun-FlagXA, yield: 10 µg
[DNA] SP6-029Jun-FlagX, yield: 7 µg
[DNA] SP6-029Jun-FlagXA(G3), yield: 8 µg
[DNA] SP6-029Jun-FlagXA(C1), yield: 6 µg
[DNA] SP6-029Jun-FlagA, yield: 10 µg
[DNA] SP6-029Jun-Flag, yield: 10 µg

[0067] Subsequently, RNA templates (= coding molecules) were prepared from the DNA templates by transcription. That
is, the above-obtained DNA templates were transcribed (37°C, 2 hours) under the following conditions by using
RiboMAX Large Scale RNA Production Systems (Promega), and purification was performed by using RNeasy Mini
Kit (QIAGEN).

Table 3 (Transcription reaction solution)

5 x SP6 buffer                             10 µl
Nucleotide mixture (ATP/UTP/CTP), 25 mM    10 µl
GTP, 10 mM                                 7.5 µl
Cap Analog (m7G(5')ppp(5')G), 40 mM        9.4 µl
DNA template                               1 µg
SP6 enzyme mixture                         5 µl
RNase-free water                           Remainder
Total                                      50 µl
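
The ratio of cap analog to GTP implied by Table 3, and the volume left for the DNA template and water, can be worked out as follows. This is a sketch under assumptions: the GTP stock is read as 10 mM (the figure is garbled in the source text), the 25 mM of the nucleotide mixture is assumed to apply to each of ATP, UTP and CTP, and all names in the code are hypothetical.

```python
TOTAL_UL = 50.0

# name: (stock concentration in mM, volume in µl)
NUCLEOTIDE_STOCKS = {
    "ATP/UTP/CTP (each)": (25.0, 10.0),  # assumption: 25 mM of each nucleotide
    "GTP": (10.0, 7.5),                  # assumption: stock read as 10 mM
    "Cap Analog": (40.0, 9.4),
}

# Other components with a stated volume, in µl.
OTHER_UL = {"5 x SP6 buffer": 10.0, "SP6 enzyme mixture": 5.0}


def nanomoles(stock_mM: float, volume_ul: float) -> float:
    """Amount in nmol from a concentration in mM and a volume in µl."""
    return stock_mM * volume_ul  # mM x µl = nmol


def final_mM(stock_mM: float, volume_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Final concentration in mM after dilution into the total volume."""
    return stock_mM * volume_ul / total_ul


cap_to_gtp = nanomoles(*NUCLEOTIDE_STOCKS["Cap Analog"]) / nanomoles(*NUCLEOTIDE_STOCKS["GTP"])
listed_ul = sum(v for _, v in NUCLEOTIDE_STOCKS.values()) + sum(OTHER_UL.values())
remainder_ul = TOTAL_UL - listed_ul  # shared by the DNA template and RNase-free water

if __name__ == "__main__":
    print(f"cap:GTP molar ratio about {cap_to_gtp:.1f}:1")
    print(f"final GTP {final_mM(*NUCLEOTIDE_STOCKS['GTP']):.2f} mM, "
          f"final cap analog {final_mM(*NUCLEOTIDE_STOCKS['Cap Analog']):.2f} mM")
    print(f"volume left for template and water: {remainder_ul:.1f} µl")
```

Under these readings the cap analog is in roughly fivefold molar excess over GTP, and about 8.1 µl remains for the 1 µg of DNA template plus water.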



[0068] By the above method, 250 to 500 pmol of the following coding molecules (RNA templates) were obtained.

[RNA] SP6-029Jun-FlagXA, yield: 450 pmol
[RNA] T7-029Jun-FlagXA, yield: 350 pmol
[RNA] SP6-AOJun-FlagXA, yield: 400 pmol
[RNA] T7-AOJun-FlagXA, yield: 500 pmol
[RNA] T7-0'Jun-FlagXA, yield: 350 pmol
[RNA] T7-KJun-FlagXA, yield: 500 pmol
[RNA] SP6-029Jun-FlagX, yield: 250 pmol
[RNA] SP6-029Jun-FlagXA(G3), yield: 300 pmol
[RNA] SP6-029Jun-FlagXA(C1), yield: 350 pmol
[RNA] SP6-029Jun-FlagA, yield: 400 pmol
[RNA] SP6-029Jun-Flag, yield: 400 pmol

(3) Translation of coding molecule 

(3-1) Translation reaction in wheat germ cell-free translation system 

[0069] The coding molecules (RNA templates) were translated (26°C, 60 minutes) under the following conditions by
using Wheat Germ Extract (Promega). Along with the translation, proteins were labeled (Miyamoto-Sato, E., Nemoto,
N., Kobayashi, K., and Yanagawa, H. (2000) Nucleic Acids Res., 28, 1176-1182; Nemoto, N., Miyamoto-Sato, E., and
Yanagawa, H. (1999) FEBS Lett., 462, 43-46). Electrophoresis was performed by 17.5% SDS-PAGE, and fluorescence
of fluorescein in the bands was measured by using a multi-format image analyzer, Molecular Imager FX (Bio-Rad).



Table 4

(Translation reaction solution)

Amino acid mixture, 1 mM             0.8 µl
1 M Potassium                        0.76 µl
RNase inhibitor, 10 U/µl             0.8 µl
RNA template (coding molecule)       4 pmol
Wheat Germ Extract (Promega)         5 µl
Fluoro-dCpPuro, 400 µM               0.6 µl
RNase-free water                     Remainder
Total                                10 µl



[0070] Jun proteins having a molecular weight of about 25 kDa were obtained from the coding molecules (RNA 
templates) by the above method and quantified by image analysis. The Jun proteins obtained from the respective 
coding molecules were designated as shown in the following table. 
Table 5

      Coding molecule (RNA template)      Jun protein
(a)   [RNA] SP6-O29Jun-FlagXA             p-SP6-O29 = p-FXA
(b)   [RNA] T7-O29Jun-FlagXA              p-T7-O29
(c)   [RNA] SP6-AOJun-FlagXA              p-SP6-AO
(d)   [RNA] T7-AOJun-FlagXA               p-T7-AO
(e)   [RNA] T7-O'Jun-FlagXA               p-T7-O'
(f)   [RNA] T7-KJun-FlagXA                p-T7-K
(g)   [RNA] SP6-O29Jun-FlagX              p-FX
(h)   [RNA] SP6-O29Jun-FlagXA(G3)         p-FX'A
(i)   [RNA] SP6-O29Jun-FlagXA(C1)         p-FX"A
(j)   [RNA] SP6-O29Jun-FlagA              p-FA



[0071] Fig. 7, A shows the results of comparing (a) to (f) above in terms of efficiency (relative rate) of translation
into the Jun proteins. These results indicate that translation efficiency was high when the 5' UTR contained the
promoter of SP6 RNA polymerase (SP6) as the transcription promoter (p-SP6-AO and p-SP6-O29) and when it contained
a part of the tobacco mosaic virus (TMV) omega sequence (O29) as an enhancer sequence (p-SP6-O29 and p-T7-O29),
and that translation efficiency was particularly high when both were contained (p-SP6-O29).

[0072] Fig. 7, B shows the results of comparing (a) and (g) to (j) above in terms of efficiency of translation into the
Jun proteins. These results indicate that translation efficiency was high when the polyA sequence was contained
in the 3'-terminal region (p-FA), and when both the polyA sequence and the XhoI sequence were contained in the
3'-terminal region (p-FXA). That is, the translation efficiency differs significantly between the Flag tag bonded with
the XhoI sequence alone and the same construct further bonded with the polyA sequence. Further, as an effect of
combining the XhoI sequence and the polyA sequence, translation efficiency for the Flag tag bonded with the XhoI
sequence and the polyA sequence is much higher than that for the Flag tag bonded with only the polyA sequence.
Further, since substitution of 1 residue in the XhoI sequence showed a strong influence, the XhoI sequence itself
is also important.

(4) Ligation of spacer molecule and coding molecule 

[0073] A spacer molecule containing a PEG region and a coding molecule (RNA template) were ligated (15°C, 20
hours) under the following conditions by using T4 RNA ligase (Takara Shuzo) and purified by using RNeasy Mini Kit
(QIAGEN). The ligation product (genotype molecule of c-jun) was subjected to electrophoresis by 8 M Urea 4% PAGE,
and fluorescence of ethidium bromide (EtBr) and fluorescein in the molecule was detected by using a multi-format
image analyzer, Molecular Imager FX (Bio-Rad). Further, similar detection was performed for the spacer molecules
and the coding molecules of G2 and G4 by changing the ligation conditions to 15°C and 4 hours, with or without
adding to the ligation reaction mixture free PEG having the same molecular weight as that of PEG in the PEG region.
Detection for the G4 molecule was further performed by changing the amount of the spacer molecules to 40 nmol
and the amount of the free PEG to 80 nmol.



Table 6

(Ligation reaction solution)

Spacer molecule containing PEG       20 nmol
(Free PEG                            60 nmol, when free PEG is added)
RNA template (coding molecule)       50 pmol
10 X buffer                          5 µl
0.1 M DTT                            1.5 µl
40 mM ATP                            0.5 µl
DMSO                                 10 µl
BSA                                  3 µl
RNase inhibitor                      1 µl
T4 RNA ligase                        10 µl
RNase-free water                     Remainder
Total                                50 µl

EP 1 350 845 A1



[0074] The following genotype molecules, which were ligation products of various spacer molecules and coding 
molecules, were obtained by the above method with ligation efficiency of 20 to 95%. 

[0075] The ligation efficiency is a relative ratio of the band intensity of the genotype molecules based on the total of 
band intensity of coding molecules and band intensity of genotype molecules taken as 100%. The band intensities 
were analyzed by using a multi-format image analyzer for the bands obtained by electrophoresis of the molecules by 
8 M Urea 4% SDS-PAGE, followed by ethidium bromide staining of the nucleic acids in the gel. 
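
The definition of ligation efficiency given above can be written as a small calculation. The following is a minimal sketch, assuming that background-corrected band intensities have already been exported from the image analyzer; the function name and the example intensities are hypothetical and are not taken from the experiments.

```python
def ligation_efficiency(coding_intensity, genotype_intensity):
    """Percentage of genotype-molecule band intensity relative to the sum of
    the coding-molecule and genotype-molecule band intensities (taken as 100%)."""
    total = coding_intensity + genotype_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * genotype_intensity / total


# Hypothetical intensities in arbitrary units, for illustration only
example = ligation_efficiency(coding_intensity=300.0, genotype_intensity=700.0)
print(f"Ligation efficiency: {example:.0f}%")
```

Because the value is a ratio of two bands in the same lane, it does not depend on the absolute amount loaded, as long as both bands are within the linear range of detection.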



Table 7

Spacer molecule                                  Coding molecule                   Genotype molecule
p(dCp)2T(Fl)pPEG(1000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA           G1
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA           G2
p(dCp)2T(Fl)pPEG(3000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA           G3
p(dCp)2T(Fl)pPEG(4000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA           G4
p(dCp)2T(Fl)pPEG(1000)pdCpPuro                   [RNA] SP6-O29Jun-FlagXA           G5
p(dCp)2T(Fl)pPEG(2000)pdCpPuro                   [RNA] SP6-O29Jun-FlagXA           G6
p(dCp)2T(Fl)pPEG(3000)pdCpPuro                   [RNA] SP6-O29Jun-FlagXA           G7
p(dCp)2T(Fl)pPEG(4000)pdCpPuro                   [RNA] SP6-O29Jun-FlagXA           G8
p(dCp)2PEG(1000)p(dCp)2Puro                      [RNA] SP6-O29Jun-FlagXA           G9
p(dCp)2PEG(2000)p(dCp)2Puro                      [RNA] SP6-O29Jun-FlagXA           G10
p(dCp)2PEG(3000)p(dCp)2Puro                      [RNA] SP6-O29Jun-FlagXA           G11
p(dCp)2PEG(3000)pdCpPuro                         [RNA] SP6-O29Jun-FlagXA           G12
p(dCp)2T(Bio)pPEG(2000)pdCpPuro                  [RNA] SP6-O29Jun-FlagXA           G13
p(dCp)2T(Bio)pPEG(2000)p(dCp)2Puro               [RNA] SP6-O29Jun-FlagXA           G14
p(dCp)2T(Bio)pT(Fl)pPEG(2000)pdCpPuro            [RNA] SP6-O29Jun-FlagXA           G15
p(dCp)2T(Bio)pT(Fl)pPEG(2000)p(dCp)2Puro         [RNA] SP6-O29Jun-FlagXA           G16
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA           G17 (= G2)
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] T7-O29Jun-FlagXA            G18
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-AOJun-FlagXA            G19
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] T7-AOJun-FlagXA             G20
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] T7-O'Jun-FlagXA             G21
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] T7-KJun-FlagXA              G22
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagX            G23
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA(G3)       G24
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagXA(C1)       G25
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-FlagA            G26
p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro                [RNA] SP6-O29Jun-Flag             G27



Table 8

Genotype molecule                                        Ligation efficiency
G1: g-(Fl)PEG(1000)dCdC                                  See Fig. 10 for efficiency.
G2: g-(Fl)PEG(2000)dCdC                                  See Fig. 10 for efficiency.
G3: g-(Fl)PEG(3000)dCdC                                  See Fig. 10 for efficiency.
G4: g-(Fl)PEG(4000)dCdC                                  Efficiency: 73% (see Fig. 10 for efficiency when amounts of
                                                         spacer molecule and free PEG are changed)
G5: g-(Fl)PEG(1000)dC                                    Efficiency: 75%
G6: g-(Fl)PEG(2000)dC                                    Efficiency: 64%
G7: g-(Fl)PEG(3000)dC                                    Efficiency: 58%
G8: g-(Fl)PEG(4000)dC                                    Efficiency: 40%
G9: g-PEG(1000)dCdC                                      Efficiency: 98%
G10: g-PEG(2000)dCdC                                     Efficiency: 90%
G11: g-PEG(3000)dCdC                                     Efficiency: 85%
G12: g-PEG(3000)dC                                       Efficiency: 84%
G13: g-(Bio)PEG(2000)dC                                  Efficiency: 85%
G14: g-(Bio)PEG(2000)dCdC                                Efficiency: 83%
G15: g-(Bio)(Fl)PEG(2000)dC                              Efficiency: 75%
G16: g-(Bio)(Fl)PEG(2000)dCdC                            Efficiency: 78%
G17 (= G2): g-(Fl)PEG(2000)dCdC (g-SP6-O29 = g-FXA)      See Fig. 9 for efficiency.
G18: g-(Fl)PEG(2000)dCdC (g-T7-O29)                      See Fig. 9 for efficiency.
G19: g-(Fl)PEG(2000)dCdC (g-SP6-AO)                      See Fig. 9 for efficiency.
G20: g-(Fl)PEG(2000)dCdC (g-T7-AO)                       See Fig. 9 for efficiency.
G21: g-(Fl)PEG(2000)dCdC (g-T7-O')                       See Fig. 9 for efficiency.
G22: g-(Fl)PEG(2000)dCdC (g-T7-K)                        See Fig. 9 for efficiency.
G23: g-(Fl)PEG(2000)dCdC (g-FX)                          See Fig. 9 for efficiency.
G24: g-(Fl)PEG(2000)dCdC (g-FX'A)                        See Fig. 9 for efficiency.
G25: g-(Fl)PEG(2000)dCdC (g-FX"A)                        See Fig. 9 for efficiency.
G26: g-(Fl)PEG(2000)dCdC (g-FA)                          See Fig. 9 for efficiency.
G27: g-(Fl)PEG(2000)dCdC (g-F)                           See Fig. 9 for efficiency.
[0076] Fig. 9, A shows the results of comparison of efficiencies for G23 (g-FX), G17 (g-FXA), G24 (g-FX'A), G25
(g-FX"A), G27 (g-F) and G26 (g-FA). Fig. 9, B shows the results of comparison of efficiencies for G20 (g-T7-AO), G19
(g-SP6-AO), G18 (g-T7-O29), G17 (g-SP6-O29), G21 (g-T7-O') and G22 (g-T7-K).

[0077] As shown in Fig. 9, A, high ligation efficiency was obtained when a polyA sequence was contained in the
3'-terminal region. The ligation efficiency of the coding molecule and the spacer molecule was conventionally 40% or
lower (Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997) FEBS Lett., 414, 405; Roberts, R.W. & Szostak, J.W.
(1997) Proc. Natl. Acad. Sci. USA, 94, 12297). However, the ligation efficiency, which is affected by the 3'-terminal
end sequence of the coding portion, is made high by adopting a 3'-terminal region containing a polyA sequence.
Further, as shown in Fig. 9, B, high ligation efficiency of 60 to 80% can be obtained thanks to the polyA sequence,
irrespective of the sequences upstream thereof.

[0078] Fig. 10 shows comparison with experiments for G2 and G4 where free PEG having the same molecular weight 
as that of PEG in the PEG region was added. 

[0079] Although the ligation efficiency tended to decline as the molecular weight of polyethylene glycol in the PEG
region increased ("Not added" in Fig. 10), ligation efficiency was improved by adding free PEG having the same
molecular weight as PEG in the PEG region to the ligation reaction, irrespective of the molecular weight of PEG in the
spacer molecule (90% for G2 and 73% for G4 when the spacer molecule is 20 nmol), and the separation process after
the reaction could be omitted.

[0080] Further, the ligation efficiency was further improved by adjusting the mixing ratio of the coding molecules and
the spacer molecules in the ligation reaction mixture (85% for G4), and hence ligation efficiency as high as 80 to
90% was obtained irrespective of the molecular weight of polyethylene glycol ("free PEG added" in Fig. 10).

(5) Construction of assigning molecule 

(5-1) Assigning translation in wheat germ cell-free translation system 

[0081] Wheat Germ Extract (Promega) was used as a wheat germ cell-free translation system, and Rabbit Reticulocyte
Lysate System (Promega) was used as a rabbit reticulocyte cell-free translation system. The c-jun genotype
molecules obtained by the above ligation were added to each of the translation systems to perform assigning translation,
and the assigning molecules were subjected to electrophoresis by 8 M Urea 11% SDS-PAGE and detected based on
fluorescence of fluorescein by using a multi-format image analyzer, Molecular Imager FX (Bio-Rad). No post-translation
treatment was performed, and the undiluted solution from the translation system was subjected to the electrophoresis.



Table 9

Rabbit reticulocyte cell-free translation system: 30°C, 30 min

Amino acid mixture, 1 mM                       0.2 µl
RNase inhibitor, 20 U/µl                       0.4 µl
Genotype (spacer portion + coding portion)     2 pmol
Rabbit Reticulocyte Lysate                     7.0 µl
RNase-free water                               Remainder
Total                                          10 µl



Table 10

Wheat germ cell-free translation system: 26°C, 30 min

Amino acid mixture, 1 mM                       0.8 µl
1 M Potassium                                  0.76 µl
RNase inhibitor, 10 U/µl                       0.8 µl
Genotype (spacer portion + coding portion)     2 pmol
Wheat Germ Extract                             5.0 µl
RNase-free water                               Remainder
Total                                          10 µl



[0082] The following assigning molecules comprising various c-jun genotypes and the phenotypes thereof were ob- 
tained with assigning efficiency of 4 to 60% by the above method. 

[0083] The assigning efficiency is a relative ratio of the fluorescence intensity of the assigning molecules based on 
the total of the band intensity of genotype molecules and the band intensity of assigning molecules taken as 100%. 
The molecules were subjected to electrophoresis by 8 M Urea 4% SDS-PAGE. and fluorescence intensity of the bands 
of the genotypes and fluorescence intensity of the assigning molecules were analyzed by using a multi-format image 
analyzer. 



Table 11

Genotype molecule                                   Assigning molecule
G1: g-(Fl)PEG(1000)dCdC                             v-(Fl)PEG(1000)dCdC
G2: g-(Fl)PEG(2000)dCdC                             v-(Fl)PEG(2000)dCdC
G3: g-(Fl)PEG(3000)dCdC                             v-(Fl)PEG(3000)dCdC
G4: g-(Fl)PEG(4000)dCdC                             v-(Fl)PEG(4000)dCdC
G5: g-(Fl)PEG(1000)dC                               v-(Fl)PEG(1000)dC
G6: g-(Fl)PEG(2000)dC                               v-(Fl)PEG(2000)dC
G7: g-(Fl)PEG(3000)dC                               v-(Fl)PEG(3000)dC
G8: g-(Fl)PEG(4000)dC                               v-(Fl)PEG(4000)dC
G17: g-(Fl)PEG(2000)dCdC (g-SP6-O29 = g-FXA)        v-(Fl)PEG(2000)dCdC (v-SP6-O29)
G18: g-(Fl)PEG(2000)dCdC (g-T7-O29)                 v-(Fl)PEG(2000)dCdC (v-T7-O29)
G19: g-(Fl)PEG(2000)dCdC (g-SP6-AO)                 v-(Fl)PEG(2000)dCdC (v-SP6-AO)
G20: g-(Fl)PEG(2000)dCdC (g-T7-AO)                  v-(Fl)PEG(2000)dCdC (v-T7-AO)
G21: g-(Fl)PEG(2000)dCdC (g-T7-O')                  v-(Fl)PEG(2000)dCdC (v-T7-O')
G22: g-(Fl)PEG(2000)dCdC (g-T7-K)                   v-(Fl)PEG(2000)dCdC (v-T7-K)



[0084] Fig. 4 shows the results of the assigning translation using the assigning molecule of G2 (= G17). Lane 1
represents a genotype molecule, Lane 2 represents a product of assigning translation of the genotype molecule in the
wheat germ cell-free translation system, Lane 3 represents a product obtained by adding 20 µM puromycin as a protein
synthesis inhibitor under the same conditions as those used for Lane 2, and Lane 4 represents a product obtained by
decomposing the protein with protease K after the reaction for Lane 2.

[0085] The assigning molecule was observed only in Lane 2. This indicates that the assigning molecule was con- 
structed by translation in the wheat germ cell-free translation system. This also indicates that detection is possible 
without post-treatment after the translation. 

[0086] Further, it is shown that assigning molecules, which were conventionally detected with RI at the cost of much
labor and time, can be readily detected by fluorescence by using genotype molecules containing a spacer portion with
a fluorescent substance introduced into the function-imparting unit (x).

(5-2) Stability of assigning molecule 



[0087] The stability of genotype molecules containing the spacer of the present invention and that of genotype molecules
containing the following conventional spacers were compared in the wheat germ and rabbit reticulocyte cell-free
translation systems. The experiment conditions were the same as in the above (5-1) except that translation was performed
at 30°C for 20 minutes in the rabbit system and at 26°C for 20 minutes in the wheat system in the presence of 20 µM
puromycin as the translation inhibitor. The results are shown in Fig. 5, A. The numerical values in the graph represent
the ratios of the remaining genotype molecules based on the amount before translation as a reference.

[0088] The coding molecules, the spacer molecules and the genotype molecules used in this experiment are shown
below.



Table 12

Coding molecule                  Spacer molecule                            Genotype molecule
[RNA] SP6-O29Jun-FlagXA          p(dCp)2T(Fl)pPEG(3000)p(dCp)2Puro          G2
[RNA] SP6-O29Jun-FlagXA          p(dCp)2T(Fl)pPEG(2000)p(dCp)2Puro          G3
[RNA] SP6-O29Jun-FlagX           p(Fl)(dAp)21[C9]3dAp(dCp)2Puro             30F
[RNA] SP6-O29Jun-FlagX           p(Fl)(dAp)27(dCp)2Puro                     30S

G2: g-(Fl)PEG(3000)dCdC
G3: g-(Fl)PEG(3000)dCdC
30F: g-(dAp)21(Fl)[C9]3dCdC
30S: g-(dAp)21(Fl)(dAp)6dCdC







[0089] It can be seen that the genotype molecules containing the spacer portion of the present invention showed
significantly higher stability in both the rabbit and wheat translation systems in comparison with those using the
conventional DNA spacer (S30 spacer: Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W. (2000) Methods in Enzymology,
318, 268-293) or the spacer containing DNA as a main component with polyethylene glycol having a molecular weight
of 400 or less (F30 spacer: Liu, R., Barrick, E., Szostak, J.W., Roberts, R.W. (2000) Methods in Enzymology, 318,
268-293) (Fig. 5, A). Therefore, the post-treatment after the assigning translation that was conventionally required to
improve assigning efficiency becomes unnecessary. Since treatment after the assigning translation can be simplified,
the working time for assigning translation can be shortened from 48 to 72 hours to 0.5 to 1 hour.

[0090] Further, it was observed that stability of the genotype tended to increase as the molecular weight of polyeth- 
ylene glycol increased. The stability was favorable with PEG having a molecular weight of 1000 or more, and the 
genotype had almost the same property as that of the DNA spacer and was unstable with PEG having a molecular 
weight of 400 or less (Fig. 5, A). 

[0091] Further, the stability of the genotype of the present invention was compared in the wheat germ and rabbit
reticulocyte cell-free translation systems. The experiment conditions were the same as in the above (5-1) except that
translation was performed in the presence of 20 µM puromycin as the translation inhibitor at 30°C for 0.25, 0.5, 1 and 2
hours in the rabbit system and at 26°C for 1, 2, 4 and 8 hours in the wheat system. The genotype molecule used in
this experiment was G17 = G2 [g-(Fl)PEG(2000)dCdC (g-SP6-O29)]. The results are shown in Fig. 5, B. Numerical
values in the graph represent the ratios of the remaining genotype molecules based on the amount before the translation
as a reference.

[0092] It can be seen that the stability of the genotype molecule was higher in the wheat germ cell-free translation
system than in the rabbit reticulocyte cell-free translation system (Fig. 5, B). Since a spacer portion containing a PEG
region as a main component makes a genotype molecule more stable, especially in the wheat germ cell-free translation
system, it becomes possible to construct a library including coding portions with a long chain length by using genotype
molecules containing such a spacer portion, and hence a library with high diversity can be obtained.
(5-3) Optimization of acceptor region and PEG region

[0093] In order to optimize the acceptor region and the PEG region of the spacer molecule of the present invention,
construction efficiency of the assigning molecules was compared in the wheat germ and rabbit reticulocyte cell-free
translation systems. The experiment conditions were the same as described above except that the acceptor region
had a constitution of dC-puromycin, spacer molecules containing the PEG regions having molecular weights of 1000,
2000, 3000 and 4000 were used, and translation was performed at 30°C for 0.5 hours in the rabbit system and at 26°C
for 0.5 hours in the wheat system. The results are shown in Fig. 6, A. The assigning efficiency was calculated by the
aforementioned method. The genotype molecules and the assigning molecules used in this experiment were G5 to G8.

[0094] Further, the results of the experiment conducted under the same conditions except that the acceptor region
had the constitution of dCdC-puromycin are shown in Fig. 6, B. The genotype molecules and the assigning molecules
used in this experiment were G1 to G4.

[0095] As shown in Fig. 6, it was found that, for a spacer portion having puromycin and a dC (deoxycytidine) sequence
of 1 residue in its acceptor region, polyethylene glycol should preferably have a molecular weight of 1000 or more,
more preferably 2000 or more, further preferably 4000 or more (Fig. 6, A). It was also found that, for a spacer portion
having puromycin and the dCdC sequence in its acceptor region, the effect was exhibited with polyethylene glycol
having a molecular weight of 1000 or more, and it was more preferably 2000 or more, further preferably 2000 to 4000
(Fig. 6, B).

(5-4) Optimization of 5' UTR 

[0096] In order to investigate the influence of the translation efficiency of the coding portion of the present invention
on the assigning efficiency, assigning molecules were constructed by using genotype molecules containing coding
portions having 5' UTRs with different translation efficiencies, and their construction efficiency was compared. The
experiment conditions were the same as in the above (5-1) except that assigning translation was performed at 26°C
for 0.5 hours in the wheat germ cell-free translation system. The results are shown in Fig. 8. The assigning efficiency
was calculated in the same manner as described in the above (5-3). The genotype molecules and the assigning molecules
used in this experiment were G17 to G22.

[0097] From the result shown in Fig. 8, it can be seen that the promoter of SP6 RNA polymerase (SP6)
is more preferred than T7 as a transcription promoter, and that a partial sequence of the omega sequence (O29) is
preferred as a translation enhancer.

[0098] Further, from the results shown in Figs. 7 and 8, it was found that the translation efficiency of the coding portion
showed a positive correlation with the assigning efficiency, and that the translation efficiency of the coding portion had
improved. Thus, it became clear that the assigning efficiency had been improved. Therefore, it can be considered that
the components contributing to the improvement of the translation efficiency also contribute to the improvement of the
translation efficiency for the 3'-terminal end sequence.

(6) Size of library using assigning molecule of the present invention 

[0099] The size of a library using the assigning molecules of the present invention in a unit volume was calculated. As
the calculation method, the absolute amount of the assigning molecules in 1 ml was calculated from the concentration
of the genotype molecules, the ratio of remaining genotype molecules and the construction efficiency of the assigning
molecules, and it was considered as the size of the library. The experiment conditions were the same as in the above
(5-1) except that translation was performed at 26°C for 1 hour in the wheat system and at 30°C for 0.5 hours in the
rabbit system. The results are shown in the following table. The genotype molecule and the assigning molecule used
in this experiment were G17 (= G2).
Table 13

Wheat germ

Genotype concentration (nM)    Assigning efficiency (%)    Assigning molecule (pmol)    Library x 10^13/ml
100                            57                          52                           3.1
200                            51                          94                           5.6
400                            37                          136                          8.2
800                            21                          155                          9.3
1600                           10                          147                          8.8

1.0 hour, ligation efficiency: 70%, genotype remaining ratio: 0.92

Rabbit reticulocyte

Genotype concentration (nM)    Assigning efficiency (%)    Assigning molecule (pmol)    Library x 10^13/ml
100                            32                          28                           1.7
200                            23                          41                           2.5
400                            14                          50                           3.0
800                            5                           36                           2.2

0.5 hour, ligation efficiency: 70%, genotype remaining ratio: 0.89
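
The calculation described in paragraph [0099] can be sketched as follows. This is an illustration rather than the procedure actually used: it assumes that the amount in pmol was rounded to an integer before conversion and that Avogadro's number was taken as approximately 6.0 x 10^23, because those two assumptions reproduce the tabulated values. All function and variable names are hypothetical.

```python
AVOGADRO_APPROX = 6.0e23  # assumption: rounded constant that reproduces the table


def assigning_pmol_per_ml(genotype_nM, efficiency_pct, remaining_ratio):
    """Assigning molecules in 1 ml, rounded to whole pmol (1 nM = 1 pmol/ml)."""
    return round(genotype_nM * efficiency_pct / 100.0 * remaining_ratio)


def library_size_e13_per_ml(pmol):
    """Number of molecules per ml, expressed in units of 10^13."""
    return round(pmol * 1e-12 * AVOGADRO_APPROX / 1e13, 1)


def build_table(rows, remaining_ratio):
    """Return (concentration, efficiency, pmol, library x 10^13/ml) per row."""
    table = []
    for conc, eff in rows:
        pmol = assigning_pmol_per_ml(conc, eff, remaining_ratio)
        table.append((conc, eff, pmol, library_size_e13_per_ml(pmol)))
    return table


# (genotype concentration in nM, assigning efficiency in %) from Table 13
WHEAT = [(100, 57), (200, 51), (400, 37), (800, 21), (1600, 10)]
RABBIT = [(100, 32), (200, 23), (400, 14), (800, 5)]

wheat_table = build_table(WHEAT, remaining_ratio=0.92)
rabbit_table = build_table(RABBIT, remaining_ratio=0.89)

for label, table in (("Wheat germ", wheat_table), ("Rabbit reticulocyte", rabbit_table)):
    print(label)
    for conc, eff, pmol, lib in table:
        print(f"  {conc:>5} nM  {eff:>3}%  {pmol:>4} pmol  {lib} x 10^13/ml")
```

The sketch makes visible why the library size peaks at an intermediate genotype concentration: the amount of genotype molecule rises linearly while the assigning efficiency falls, so their product passes through a maximum (around 800 nM in the wheat system and 400 nM in the rabbit system).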



[0100] As clearly seen from the results shown in the above table, the construction efficiency of the assigning
molecules was improved from 0% to 50 to 60% in the wheat germ cell-free translation system and from 10% or lower to
20 to 30% in the rabbit reticulocyte cell-free translation system. Further, the scale of the library was improved from
0/ml (construction was impossible) to 10^*/ml in the wheat germ cell-free translation system and from 10^^/ml to
10'^/ml in the rabbit reticulocyte cell-free translation system.

[0101] Further, although the rabbit reticulocyte system was not so practical due to lack of stability of the genotype
molecule and was conventionally applied only to genotype molecules having a short chain length (Roberts, R.W. &
Szostak, J.W. (1997) Proc. Natl. Acad. Sci. USA, 94, 12297; Nemoto, N., Miyamoto-Sato, E., Yanagawa, H. (1997)
FEBS Lett., 414, 405), it can be seen from the results shown in Figs. 4 and 5 and the above table that an assigning
molecule having a spacer portion containing a PEG region is more stable in the wheat germ system, and that the wheat
germ system using such an assigning molecule is a practically useful system where long chain lengths can be handled.



Industrial Applicability 



[0102] A coding molecule having a long chain length can be handled in a wheat germ translation system by using 
the molecule of the present invention. That is, a practically useful assigning translation system is provided. 
SEQUENCE LISTING

<110> Keio University

<120> Molecule for Assigning Genotype to Phenotype and Components Thereof
as well as Method for Constructing Assigning Molecule and Method for
Utilizing Assigning Molecule

<130> P121-0P127S

<150> JP 2000-380562
<151> 2000-12-14

<160> 13

<210> 1
<211> 378
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR template containing part of c-jun sequence; c-jun[F]
<400> 1

atggctagca tgactggtgg acagcaaatg ggtgcggccg cgccggagat gccgggagag 60 
acgccgcccc tgtcccctat cgacatggag tctcaggagc ggatcaaggc agagaggaag 120 
cgcatgagga accgcattgc cgcctccaag tgccggaaaa ggaagctgga gcggatcgct 180 
cggctagagg aaaaagtgaa aaccttgaaa gcgcaaaact ccgagctggc atccacggcc 240 
aacatgctca gggaacaggt ggcacagctt aagcagaaag tcatgaacca cgttaacagt 300 
gggtgccaac tcatgctaac gcagcagttg caaacgttta gaccgcggga ctacaaggac 360 
gatgacgaca agctcgag 378 

<210> 2 
<211> 76 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer; SP6-O29
<400> 2 

atttaggtga cactatagaa caacaacaac aacaaacaac aacaaaatgg ctagcatgac 60 
tggtggacag caaatg 76 
<210> 3
<211> 76
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; T7-O29
<400> 3

taatacgact cactataggg caacaacaac aacaaacaac aacaaaatgg ctagcatgac
tggtggacag caaatg

<210> 4
<211> 73
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; SP6-AO
<400> 4

taatacgact cactataggg agaccacaac ggtttcccat ttaggtgaca ctatagaata
cacggaattc gcg

<210> 5
<211> 73
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; T7-AO
<400> 5

taatacgact cactataggg agaccacaac ggtttcccat ttaggtgaca ctatagaata
cacggaattc gcg

<210> 6
<211> 72
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; T7-O'
<400> 6

taatacgact cactataggg acaattacta tttacaatta caatggctag catgactggt 60
ggacagcaaa tg 72

<210> 7
<211> 71
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; T7-K
<400> 7

taatacgact cactataggg agaccacaac ggtttcccgc cgccaccatg gctagcatga 60
ctggtggaca g 71

<210> 8
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; FlagXA
<400> 8

ttttttttct cgagcttgtc gtcatcgtcc ttgtag 36

<210> 9
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; FlagX
<400> 9

ctcgagcttg tcgtcatcgt ccttgtag 28

<210> 10
<211> 38
<212> DNA
<213> Artificial Sequence
<220>
<223> PCR primer; FlagXA(G3)
<400> 10

ttttttttct cgaccttgtc gtcatcgtcc ttgtagtc

<210> 11 

<211> 38 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer; FlagXA(C1)

<400> 11 

ttttttttgt cgagcttgtc gtcatcgtcc ttgtagtc 

<210> 12

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer; FlagA 

<400> 12 

ttttttttct tgtcgtcatc gtccttgtag tcccg 

<210> 13 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR primer; Flag 

<400> 13 

gactacaagg acgatgacga caag 
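
As an illustrative cross-check of the reverse primers listed above (SEQ ID NOs: 8, 9 and 12), the sketch below takes the reverse complement of each primer to obtain the sense-strand 3'-terminal sequence it encodes, and then looks for the XhoI recognition sequence (ctcgag) and a trailing polyA stretch. The helper names are hypothetical, and the check only inspects the primer sequences themselves; it says nothing about the experimental results.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")
XHOI_SITE = "ctcgag"  # recognition sequence of restriction enzyme XhoI


def reverse_complement(seq):
    """Reverse complement of a DNA sequence; spaces from the listing are ignored."""
    cleaned = seq.replace(" ", "").lower()
    return cleaned.translate(COMPLEMENT)[::-1]


def trailing_a_count(seq):
    """Number of consecutive 'a' residues at the 3' end of a sequence."""
    return len(seq) - len(seq.rstrip("a"))


# Reverse primers as given in the sequence listing
reverse_primers = {
    "FlagXA": "ttttttttct cgagcttgtc gtcatcgtcc ttgtag",  # SEQ ID NO: 8
    "FlagX": "ctcgagcttg tcgtcatcgt ccttgtag",            # SEQ ID NO: 9
    "FlagA": "ttttttttct tgtcgtcatc gtccttgtag tcccg",    # SEQ ID NO: 12
}

summary = {}
for name, primer in reverse_primers.items():
    sense = reverse_complement(primer)
    summary[name] = {
        "sense_3prime": sense,
        "has_xhoi": XHOI_SITE in sense,
        "polyA_length": trailing_a_count(sense),
    }

for name, info in summary.items():
    print(name, info)
```

Read this way, FlagXA carries both the XhoI sequence and an A8 stretch, FlagX carries only the XhoI sequence, and FlagA carries only the A8 stretch, which matches the naming used for the coding molecules.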
Claims

1. A spacer molecule comprising a donor region which can be bonded to a 3'-terminal end of nucleic acid, a PEG
region that is bonded to the donor region and comprises polyethylene glycol as a main component and a peptide
acceptor region which is bonded to the PEG region and comprises a group which can be bonded to a peptide by
transpeptidation.

2. The spacer molecule according to claim 1, wherein the peptide acceptor region comprises puromycin or a derivative
thereof, or puromycin or a derivative thereof and one or two residues of deoxyribonucleotides or ribonucleotides.

3. The spacer molecule according to claim 1 or 2, which comprises at least one function-imparting unit between the
donor region and the PEG region.

4. The spacer molecule according to claim 3, wherein the function-imparting unit is at least one residue of functionally
modified deoxyribonucleotide or ribonucleotide.

5. A coding molecule, which is a nucleic acid comprising a 5' untranslated region comprising a transcription promoter
and a translation enhancer; an ORF region that is bonded to the 3'-terminal side of the 5' untranslated region and
encodes a protein; and a 3'-terminal region which is bonded to the 3'-terminal side of the ORF region and comprises
a polyA sequence and a sequence which a restriction enzyme XhoI recognizes on the 5'-terminal side of the polyA
sequence.

6. The coding molecule according to claim 5, wherein the transcription promoter is a promoter of SP6 RNA polymerase.

7. The coding molecule according to claim 5 or 6, wherein the translation enhancer is a part of the TMV omega
sequence of tobacco mosaic virus (O29).

8. The coding molecule according to any one of claims 5 to 7, which comprises an affinity tag sequence in a portion
downstream from the ORF region.

9. The coding molecule according to claim 8, wherein the affinity tag sequence is a Flag-tag sequence, which is a
tag for affinity separation and analysis based on an antigen-antibody reaction.

10. A genotype molecule constructed by bonding a 3'-terminal end of a coding molecule which is a nucleic acid
comprising a 5' untranslated region comprising a transcription promoter and a translation enhancer; an ORF region
which is bonded to the 3'-terminal side of the 5' untranslated region and encodes a protein; and a 3'-terminal region
that is bonded to the 3'-terminal side of the ORF region and comprises a polyA sequence to the donor region of
the spacer molecule as defined in any one of claims 1 to 4.

40 

11. The genotype molecule according to claim 10, wherein the transcription promoter is a promoter of SP6 RNA 
polymerase. 

12. The genotype molecule according to claim 10 or 11, wherein the translation enhancer is a part of the TMV omega
sequence of tobacco mosaic virus (O29).

13. The genotype molecule according to any one of claims 10 to 12, wherein the 3'-terminal end sequence comprises
a sequence which a restriction enzyme XhoI recognizes on the 5'-terminal end side of the polyA sequence.

14. The genotype molecule according to any one of claims 10 to 13, which comprises an affinity tag sequence in a
portion downstream from the ORF region.

15. The genotype molecule according to claim 14, wherein the affinity tag sequence is a Flag-tag sequence, which is
a tag for affinity separation and analysis based on an antigen-antibody reaction.

16. A method for constructing a genotype molecule, which comprises bonding (a) a 3'-terminal end of a coding molecule
which is RNA comprising a 5' untranslated region comprising a transcription promoter and a translation enhancer;
an ORF region which is bonded on the 3'-terminal side of the 5' untranslated region and encodes a protein; and
a 3'-terminal region which is bonded to the 3'-terminal side of the ORF region and comprises a polyA sequence,
to (b) the donor region of the spacer molecule as defined in any one of claims 1 to 4, which comprises RNA, by
using RNA ligase in the presence of free polyethylene glycol having the same molecular weight as that of
polyethylene glycol constituting the PEG region in the spacer molecule.

17. An assigning molecule constructed by ligating the genotype molecule as defined in any one of claims 10 to 15 to
a phenotype molecule which is a protein encoded by the ORF region in the genotype molecule, by transpeptidation.

18. A method for constructing an assigning molecule, which comprises translating the genotype molecule as defined
in any one of claims 10 to 15 in a cell-free translation system to ligate the genotype molecule to a phenotype
molecule which is a protein encoded by the ORF region in the genotype molecule, by transpeptidation.

19. The method according to claim 18, wherein the cell-free translation system is a wheat germ cell-free translation
system.

20. The method according to claim 18, wherein the cell-free translation system is a rabbit reticulocyte cell-free
translation system.

21. A method for screening a nucleotide sequence encoding a protein which acts on a target substance, which
comprises measuring an interaction between a decoded portion of an assigning molecule and the target substance by
using a library comprising a plurality of the assigning molecules as defined in claim 17, among which at least a
part of the assigning molecules have different sequences of the ORF regions in their coding portions, and detecting
the nucleotide sequence of the coding portion of the assigning molecule exhibiting the interaction.